Overview


Dataset statistics

Number of variables: 71
Number of observations: 186529
Missing cells: 5217520
Missing cells (%): 39.4%
Total size in memory: 101.0 MiB
Average record size in memory: 568.0 B
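
The missing-cell percentage can be checked from the other figures in this block. The short sketch below only re-derives it from the reported counts; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 71
n_observations = 186_529
missing_cells = 5_217_520

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Rounded to one decimal place, the result agrees with the 39.4% shown above.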

Variable types

Text: 71

Dataset

Description: Botany Division, Yale Peabody Museum 0061682-241126133413365
URL: https://doi.org/10.15468/dl.twf535

Alerts

accessRights has constant value "Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj" Constant
license has constant value "http://creativecommons.org/publicdomain/zero/1.0/" Constant
rightsHolder has constant value "Yale Peabody Museum" Constant
institutionCode has constant value "YPM" Constant
ownerInstitutionCode has constant value "YPM" Constant
basisOfRecord has constant value "PreservedSpecimen" Constant
individualCount has constant value "1" Constant
preparations has constant value "tissue (frozen)" Constant
disposition has constant value "in collection" Constant
subgenus has constant value "Cyclosorus" Constant
nomenclaturalCode has constant value "ICBN" Constant
taxonRemarks has constant value "Animals and Plants: Plants" Constant
recordNumber has 139017 (74.5%) missing values Missing
recordedBy has 75764 (40.6%) missing values Missing
reproductiveCondition has 186504 (> 99.9%) missing values Missing
preparations has 186476 (> 99.9%) missing values Missing
associatedMedia has 9347 (5.0%) missing values Missing
associatedReferences has 176462 (94.6%) missing values Missing
associatedTaxa has 185782 (99.6%) missing values Missing
eventDate has 82488 (44.2%) missing values Missing
year has 82575 (44.3%) missing values Missing
month has 92324 (49.5%) missing values Missing
day has 103361 (55.4%) missing values Missing
habitat has 157729 (84.6%) missing values Missing
higherGeography has 72099 (38.7%) missing values Missing
continent has 73523 (39.4%) missing values Missing
waterBody has 183495 (98.4%) missing values Missing
country has 72769 (39.0%) missing values Missing
stateProvince has 78016 (41.8%) missing values Missing
county has 98586 (52.9%) missing values Missing
municipality has 110052 (59.0%) missing values Missing
locality has 125307 (67.2%) missing values Missing
minimumElevationInMeters has 178933 (95.9%) missing values Missing
maximumElevationInMeters has 185793 (99.6%) missing values Missing
verbatimElevation has 178933 (95.9%) missing values Missing
decimalLatitude has 82100 (44.0%) missing values Missing
decimalLongitude has 82100 (44.0%) missing values Missing
geodeticDatum has 82283 (44.1%) missing values Missing
coordinateUncertaintyInMeters has 82137 (44.0%) missing values Missing
georeferencedBy has 182211 (97.7%) missing values Missing
georeferencedDate has 174887 (93.8%) missing values Missing
georeferenceProtocol has 82331 (44.1%) missing values Missing
georeferenceSources has 83888 (45.0%) missing values Missing
georeferenceRemarks has 85474 (45.8%) missing values Missing
typeStatus has 182607 (97.9%) missing values Missing
identifiedBy has 180415 (96.7%) missing values Missing
dateIdentified has 184582 (99.0%) missing values Missing
identificationRemarks has 182833 (98.0%) missing values Missing
phylum has 28457 (15.3%) missing values Missing
class has 132536 (71.1%) missing values Missing
order has 34477 (18.5%) missing values Missing
family has 28617 (15.3%) missing values Missing
genus has 28567 (15.3%) missing values Missing
subgenus has 186528 (> 99.9%) missing values Missing
specificEpithet has 52532 (28.2%) missing values Missing
infraspecificEpithet has 185713 (99.6%) missing values Missing
scientificNameAuthorship has 36820 (19.7%) missing values Missing
gbifID has unique values Unique
bibliographicCitation has unique values Unique
references has unique values Unique
dynamicProperties has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
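
The three alert types listed above (Constant, Missing, Unique) can be illustrated with a small standard-library sketch. This is not ydata-profiling's implementation, and it ignores any thresholds the tool may apply; the function name `column_alerts` and the use of `None` for a missing cell are assumptions made for the example.

```python
def column_alerts(values):
    """Return alert labels for one column; None marks a missing cell."""
    n = len(values)
    present = [v for v in values if v is not None]
    distinct = set(present)
    alerts = []
    # Constant: exactly one distinct non-missing value.
    if len(distinct) == 1:
        alerts.append("Constant")
    # Missing: at least one cell has no value.
    if len(present) < n:
        pct = 100 * (n - len(present)) / n
        alerts.append(f"Missing ({pct:.1f}%)")
    # Unique: no missing cells and no repeated value.
    if n > 0 and len(present) == n and len(distinct) == n:
        alerts.append("Unique")
    return alerts


print(column_alerts(["YPM", "YPM", "YPM"]))
print(column_alerts(["tissue (frozen)", None, None, None]))
print(column_alerts(["1038985783", "1038985820", "1038985793"]))
```

The second call shows how a column such as preparations or subgenus can carry both a Constant and a Missing alert: the few cells that are filled all hold the same value.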

Reproduction

Analysis started: 2025-01-14 16:26:50.878058
Analysis finished: 2025-01-14 16:27:01.550083
Duration: 10.67 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
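
A report of this kind could be regenerated roughly as follows. This is a sketch under stated assumptions, not the configuration actually used: the contents of config.json are not shown here, the input is assumed to be a tab-separated GBIF occurrence file, and `build_report`, `tsv_path` and `html_path` are hypothetical names. Reading every column as a string is a guess based on all 71 variables being typed as Text.

```python
def build_report(tsv_path, html_path="report.html"):
    """Profile an occurrence download and write an HTML report."""
    # Imported lazily so the sketch can be read without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: tab-separated input; dtype=str keeps every column as text.
    df = pd.read_csv(tsv_path, sep="\t", dtype=str)

    profile = ProfileReport(
        df,
        title="Botany Division, Yale Peabody Museum",
        # Dataset metadata shown in the report overview.
        dataset={
            "description": "Botany Division, Yale Peabody Museum 0061682-241126133413365",
            "url": "https://doi.org/10.15468/dl.twf535",
        },
    )
    profile.to_file(html_path)
    return profile
```

Calling `build_report("occurrence.txt")` with a real file path would write the HTML report next to the script; the file name here is a placeholder.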

Variables

gbifID
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1865290
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
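
As a rough illustration of the category counts reported here, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below maps the two-letter codes to the long names used in this report; the mapping covers only a handful of categories and the function name `category_counts` is an assumption. Scripts and blocks are not available from `unicodedata` and are not covered.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes; others fall back to the code.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return dict(counts)


# A gbifID value consists of decimal digits only.
print(category_counts("1038985783"))
```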

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: 1038985783
2nd row: 1038985820
3rd row: 1038985793
4th row: 1805296727
5th row: 4539832816

Value                  Count   Frequency (%)
1038985783             1       < 0.1%
1038985974             1       < 0.1%
1038985864             1       < 0.1%
1805437104             1       < 0.1%
1038985828             1       < 0.1%
1038985793             1       < 0.1%
1805296727             1       < 0.1%
4539832816             1       < 0.1%
1038985782             1       < 0.1%
1038985792             1       < 0.1%
Other values (186519)  186519  > 99.9%

Most occurring characters

Value  Count   Frequency (%)
0      257934  13.8%
8      257321  13.8%
3      244061  13.1%
1      238732  12.8%
9      237662  12.7%
4      158719  8.5%
5      155085  8.3%
2      126849  6.8%
6      103154  5.5%
7      85773   4.6%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1865290  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      257934  13.8%
8      257321  13.8%
3      244061  13.1%
1      238732  12.8%
9      237662  12.7%
4      158719  8.5%
5      155085  8.3%
2      126849  6.8%
6      103154  5.5%
7      85773   4.6%

Most occurring scripts

Value   Count    Frequency (%)
Common  1865290  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      257934  13.8%
8      257321  13.8%
3      244061  13.1%
1      238732  12.8%
9      237662  12.7%
4      158719  8.5%
5      155085  8.3%
2      126849  6.8%
6      103154  5.5%
7      85773   4.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1865290  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      257934  13.8%
8      257321  13.8%
3      244061  13.1%
1      238732  12.8%
9      237662  12.7%
4      158719  8.5%
5      155085  8.3%
2      126849  6.8%
6      103154  5.5%
7      85773   4.6%

accessRights
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 129
Median length: 129
Mean length: 129
Min length: 129

Characters and Unicode

Total characters: 24062241
Distinct characters: 38
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
2nd row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
3rd row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
4th row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
5th row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj

Value                                             Count   Frequency (%)
open                                              186529  11.1%
access                                            186529  11.1%
http://creativecommons.org/publicdomain/zero/1.0  186529  11.1%
see                                               186529  11.1%
yale                                              186529  11.1%
peabody                                           186529  11.1%
policies                                          186529  11.1%
at                                                186529  11.1%
http://hdl.handle.net/10079/8931zqj               186529  11.1%

Most occurring characters

ValueCountFrequency (%)
e 2238348
 
9.3%
/ 1865290
 
7.8%
1492232
 
6.2%
t 1305703
 
5.4%
o 1305703
 
5.4%
a 1119174
 
4.7%
c 1119174
 
4.7%
i 932645
 
3.9%
n 932645
 
3.9%
s 932645
 
3.9%
Other values (28) 10818682
45.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16228023
67.4%
Other Punctuation 3544051
 
14.7%
Decimal Number 2051819
 
8.5%
Space Separator 1492232
 
6.2%
Uppercase Letter 746116
 
3.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2238348
13.8%
t 1305703
 
8.0%
o 1305703
 
8.0%
a 1119174
 
6.9%
c 1119174
 
6.9%
i 932645
 
5.7%
n 932645
 
5.7%
s 932645
 
5.7%
l 932645
 
5.7%
p 932645
 
5.7%
Other values (12) 4476696
27.6%
Decimal Number
ValueCountFrequency (%)
1 559587
27.3%
0 559587
27.3%
9 373058
18.2%
8 186529
 
9.1%
7 186529
 
9.1%
3 186529
 
9.1%
Other Punctuation
ValueCountFrequency (%)
/ 1865290
52.6%
. 746116
 
21.1%
: 559587
 
15.8%
; 186529
 
5.3%
, 186529
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
P 186529
25.0%
O 186529
25.0%
Y 186529
25.0%
A 186529
25.0%
Space Separator
ValueCountFrequency (%)
1492232
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16974139
70.5%
Common 7088102
29.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2238348
13.2%
t 1305703
 
7.7%
o 1305703
 
7.7%
a 1119174
 
6.6%
c 1119174
 
6.6%
i 932645
 
5.5%
n 932645
 
5.5%
s 932645
 
5.5%
l 932645
 
5.5%
p 932645
 
5.5%
Other values (16) 5222812
30.8%
Common
ValueCountFrequency (%)
/ 1865290
26.3%
1492232
21.1%
. 746116
 
10.5%
: 559587
 
7.9%
1 559587
 
7.9%
0 559587
 
7.9%
9 373058
 
5.3%
8 186529
 
2.6%
7 186529
 
2.6%
3 186529
 
2.6%
Other values (2) 373058
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24062241
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2238348
 
9.3%
/ 1865290
 
7.8%
1492232
 
6.2%
t 1305703
 
5.4%
o 1305703
 
5.4%
a 1119174
 
4.7%
c 1119174
 
4.7%
i 932645
 
3.9%
n 932645
 
3.9%
s 932645
 
3.9%
Other values (28) 10818682
45.0%

bibliographicCitation
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 62
Median length: 55
Mean length: 28.163299
Min length: 15

Characters and Unicode

Total characters: 5253272
Distinct characters: 68
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: Luzula bulbosa (YU.036650)
2nd row: Gentiana clausa (CBS.028950)
3rd row: Carex muhlenbergii (YU.070008)
4th row: Lophocolea minor (YU.204399)
5th row: Plantae (YU.175465)

Value                  Count   Frequency (%)
plantae                28374   5.5%
carex                  8803    1.7%
var                    3699    0.7%
dryopteris             2392    0.5%
sphagnum               2360    0.5%
juncus                 1814    0.4%
frullania              1708    0.3%
asplenium              1557    0.3%
scapania               1517    0.3%
canadensis             1515    0.3%
Other values (197634)  462834  89.6%
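
The sample rows above pair a taxon name with a parenthesised catalogue reference such as `YU.036650`. A minimal sketch for splitting such a value is given below. The pattern is inferred from the five sample rows only, so values with a different shape simply return `None`; `CITATION_RE` and `split_citation` are hypothetical names.

```python
import re

# Pattern inferred from the five sample rows only; other shapes may exist.
CITATION_RE = re.compile(
    r"(?P<name>.+) \((?P<collection>[A-Z]+)\.(?P<number>\d+)\)"
)


def split_citation(text):
    """Split 'Taxon name (CODE.number)' into its parts, or return None."""
    match = CITATION_RE.fullmatch(text)
    if match is None:
        return None
    return match.group("name"), match.group("collection"), match.group("number")


print(split_citation("Luzula bulbosa (YU.036650)"))
print(split_citation("Plantae (YU.175465)"))
```

The catalogue number is kept as a string so that leading zeros survive.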

Most occurring characters

ValueCountFrequency (%)
a 389969
 
7.4%
330044
 
6.3%
i 262458
 
5.0%
0 223252
 
4.2%
e 205819
 
3.9%
l 196972
 
3.7%
. 190701
 
3.6%
( 186530
 
3.6%
) 186530
 
3.6%
r 175234
 
3.3%
Other values (58) 2905763
55.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2641527
50.3%
Decimal Number 1119258
21.3%
Uppercase Letter 597975
 
11.4%
Space Separator 330044
 
6.3%
Other Punctuation 190703
 
3.6%
Open Punctuation 186530
 
3.6%
Close Punctuation 186530
 
3.6%
Dash Punctuation 705
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 389969
14.8%
i 262458
9.9%
e 205819
 
7.8%
l 196972
 
7.5%
r 175234
 
6.6%
n 170772
 
6.5%
u 162576
 
6.2%
o 156926
 
5.9%
s 154857
 
5.9%
t 146008
 
5.5%
Other values (16) 619936
23.5%
Uppercase Letter
ValueCountFrequency (%)
U 149178
24.9%
Y 148189
24.8%
C 64443
10.8%
S 55381
 
9.3%
P 49351
 
8.3%
B 44799
 
7.5%
A 13951
 
2.3%
L 10862
 
1.8%
D 7781
 
1.3%
R 6989
 
1.2%
Other values (16) 47051
 
7.9%
Decimal Number
ValueCountFrequency (%)
0 223252
19.9%
2 150900
13.5%
1 131446
11.7%
3 102608
9.2%
4 92196
8.2%
5 86628
 
7.7%
8 85549
 
7.6%
7 83994
 
7.5%
6 83823
 
7.5%
9 78862
 
7.0%
Other Punctuation
ValueCountFrequency (%)
. 190701
> 99.9%
? 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
330044
100.0%
Open Punctuation
ValueCountFrequency (%)
( 186530
100.0%
Close Punctuation
ValueCountFrequency (%)
) 186530
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 705
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3239502
61.7%
Common 2013770
38.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 389969
 
12.0%
i 262458
 
8.1%
e 205819
 
6.4%
l 196972
 
6.1%
r 175234
 
5.4%
n 170772
 
5.3%
u 162576
 
5.0%
o 156926
 
4.8%
s 154857
 
4.8%
U 149178
 
4.6%
Other values (42) 1214741
37.5%
Common
ValueCountFrequency (%)
330044
16.4%
0 223252
11.1%
. 190701
9.5%
( 186530
9.3%
) 186530
9.3%
2 150900
7.5%
1 131446
 
6.5%
3 102608
 
5.1%
4 92196
 
4.6%
5 86628
 
4.3%
Other values (6) 332935
16.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5253272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 389969
 
7.4%
330044
 
6.3%
i 262458
 
5.0%
0 223252
 
4.2%
e 205819
 
3.9%
l 196972
 
3.7%
. 190701
 
3.6%
( 186530
 
3.6%
) 186530
 
3.6%
r 175234
 
3.3%
Other values (58) 2905763
55.3%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 49
Median length: 49
Mean length: 49
Min length: 49

Characters and Unicode

Total characters: 9139921
Distinct characters: 24
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: http://creativecommons.org/publicdomain/zero/1.0/
2nd row: http://creativecommons.org/publicdomain/zero/1.0/
3rd row: http://creativecommons.org/publicdomain/zero/1.0/
4th row: http://creativecommons.org/publicdomain/zero/1.0/
5th row: http://creativecommons.org/publicdomain/zero/1.0/

Value                                             Count   Frequency (%)
http://creativecommons.org/publicdomain/zero/1.0  186529  100.0%

Most occurring characters

ValueCountFrequency (%)
/ 1119174
 
12.2%
o 932645
 
10.2%
m 559587
 
6.1%
c 559587
 
6.1%
r 559587
 
6.1%
e 559587
 
6.1%
t 559587
 
6.1%
i 559587
 
6.1%
. 373058
 
4.1%
n 373058
 
4.1%
Other values (14) 2984464
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7088102
77.6%
Other Punctuation 1678761
 
18.4%
Decimal Number 373058
 
4.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 932645
13.2%
m 559587
 
7.9%
c 559587
 
7.9%
r 559587
 
7.9%
e 559587
 
7.9%
t 559587
 
7.9%
i 559587
 
7.9%
n 373058
 
5.3%
a 373058
 
5.3%
p 373058
 
5.3%
Other values (9) 1678761
23.7%
Other Punctuation
ValueCountFrequency (%)
/ 1119174
66.7%
. 373058
 
22.2%
: 186529
 
11.1%
Decimal Number
ValueCountFrequency (%)
1 186529
50.0%
0 186529
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7088102
77.6%
Common 2051819
 
22.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 932645
13.2%
m 559587
 
7.9%
c 559587
 
7.9%
r 559587
 
7.9%
e 559587
 
7.9%
t 559587
 
7.9%
i 559587
 
7.9%
n 373058
 
5.3%
a 373058
 
5.3%
p 373058
 
5.3%
Other values (9) 1678761
23.7%
Common
ValueCountFrequency (%)
/ 1119174
54.5%
. 373058
 
18.2%
1 186529
 
9.1%
: 186529
 
9.1%
0 186529
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9139921
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 1119174
 
12.2%
o 932645
 
10.2%
m 559587
 
6.1%
c 559587
 
6.1%
r 559587
 
6.1%
e 559587
 
6.1%
t 559587
 
6.1%
i 559587
 
6.1%
. 373058
 
4.1%
n 373058
 
4.1%
Other values (14) 2984464
32.7%
Distinct: 7024
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 4476696
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1517
Unique (%): 0.8%

Sample

1st row: 2023-03-01T19:35:25.000Z
2nd row: 2020-10-02T23:17:12.000Z
3rd row: 2020-12-23T21:50:47.000Z
4th row: 2020-06-26T23:18:45.000Z
5th row: 2024-03-19T11:52:47.000Z

Value                     Count   Frequency (%)
2015-11-29t17:24:32.000z  16880   9.0%
2020-12-23t21:50:47.000z  9978    5.3%
2020-08-11t23:38:35.000z  9456    5.1%
2020-10-02t23:17:12.000z  6413    3.4%
2022-03-19t21:48:41.000z  5153    2.8%
2015-11-29t17:24:36.000z  5077    2.7%
2019-12-07t23:19:07.000z  4868    2.6%
2015-11-28t13:37:37.000z  3604    1.9%
2015-11-28t13:37:48.000z  3531    1.9%
2024-03-20t22:00:25.000z  3149    1.7%
Other values (7014)       118420  63.5%
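
The 24-character sample values above look like ISO 8601 timestamps in UTC with millisecond precision, although the report treats the variable as plain text. Assuming that reading is right, they could be converted to timezone-aware datetimes with the standard library; `parse_timestamp` is a hypothetical helper name.

```python
from datetime import datetime


def parse_timestamp(text):
    """Parse a value such as '2023-03-01T19:35:25.000Z' into an aware datetime."""
    # %f reads the fractional seconds; %z accepts the trailing 'Z' as UTC.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f%z")


ts = parse_timestamp("2023-03-01T19:35:25.000Z")
print(ts.isoformat())
```

`strptime` is used rather than `datetime.fromisoformat` because the latter only accepts the trailing `Z` from Python 3.11 onward.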

Most occurring characters

ValueCountFrequency (%)
0 1059689
23.7%
2 718500
16.0%
1 466016
10.4%
- 373058
 
8.3%
: 373058
 
8.3%
3 227531
 
5.1%
4 206600
 
4.6%
T 186529
 
4.2%
. 186529
 
4.2%
Z 186529
 
4.2%
Other values (5) 492657
11.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3170993
70.8%
Other Punctuation 559587
 
12.5%
Dash Punctuation 373058
 
8.3%
Uppercase Letter 373058
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1059689
33.4%
2 718500
22.7%
1 466016
14.7%
3 227531
 
7.2%
4 206600
 
6.5%
5 159192
 
5.0%
7 107197
 
3.4%
8 80992
 
2.6%
9 79600
 
2.5%
6 65676
 
2.1%
Other Punctuation
ValueCountFrequency (%)
: 373058
66.7%
. 186529
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 186529
50.0%
Z 186529
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4103638
91.7%
Latin 373058
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1059689
25.8%
2 718500
17.5%
1 466016
11.4%
- 373058
 
9.1%
: 373058
 
9.1%
3 227531
 
5.5%
4 206600
 
5.0%
. 186529
 
4.5%
5 159192
 
3.9%
7 107197
 
2.6%
Other values (3) 226268
 
5.5%
Latin
ValueCountFrequency (%)
T 186529
50.0%
Z 186529
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4476696
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1059689
23.7%
2 718500
16.0%
1 466016
10.4%
- 373058
 
8.3%
: 373058
 
8.3%
3 227531
 
5.1%
4 206600
 
4.6%
T 186529
 
4.2%
. 186529
 
4.2%
Z 186529
 
4.2%
Other values (5) 492657
11.0%

references
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 63
Median length: 59
Mean length: 59.20648264
Min length: 59

Characters and Unicode

Total characters: 11043726
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: http://collections.peabody.yale.edu/search/Record/YU.036650
2nd row: http://collections.peabody.yale.edu/search/Record/CBS.028950
3rd row: http://collections.peabody.yale.edu/search/Record/YU.070008
4th row: http://collections.peabody.yale.edu/search/Record/YU.204399
5th row: http://collections.peabody.yale.edu/search/Record/YU.175465

Value                                                        Count   Frequency (%)
http://collections.peabody.yale.edu/search/record/yu.036650  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.065082  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.065678  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.234842  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.012442  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.070008  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.204399  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.175465  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.060443  1       < 0.1%
http://collections.peabody.yale.edu/search/record/yu.038995  1       < 0.1%
Other values (186519)                                        186519  > 99.9%

Most occurring characters

ValueCountFrequency (%)
e 1119174
 
10.1%
/ 932645
 
8.4%
. 746144
 
6.8%
c 746116
 
6.8%
o 746116
 
6.8%
l 559587
 
5.1%
a 559587
 
5.1%
t 559587
 
5.1%
d 559587
 
5.1%
h 373058
 
3.4%
Other values (25) 4142125
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7461160
67.6%
Other Punctuation 1865318
 
16.9%
Decimal Number 1119258
 
10.1%
Uppercase Letter 597990
 
5.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1119174
15.0%
c 746116
10.0%
o 746116
10.0%
l 559587
 
7.5%
a 559587
 
7.5%
t 559587
 
7.5%
d 559587
 
7.5%
h 373058
 
5.0%
y 373058
 
5.0%
p 373058
 
5.0%
Other values (6) 1492232
20.0%
Decimal Number
ValueCountFrequency (%)
0 223252
19.9%
2 150900
13.5%
1 131446
11.7%
3 102608
9.2%
4 92196
8.2%
5 86628
 
7.7%
8 85549
 
7.6%
7 83994
 
7.5%
6 83823
 
7.5%
9 78862
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
R 186529
31.2%
Y 148126
24.8%
U 148126
24.8%
C 38403
 
6.4%
B 38403
 
6.4%
S 38403
 
6.4%
Other Punctuation
ValueCountFrequency (%)
/ 932645
50.0%
. 746144
40.0%
: 186529
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8059150
73.0%
Common 2984576
 
27.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1119174
13.9%
c 746116
 
9.3%
o 746116
 
9.3%
l 559587
 
6.9%
a 559587
 
6.9%
t 559587
 
6.9%
d 559587
 
6.9%
h 373058
 
4.6%
y 373058
 
4.6%
p 373058
 
4.6%
Other values (12) 2090222
25.9%
Common
ValueCountFrequency (%)
/ 932645
31.2%
. 746144
25.0%
0 223252
 
7.5%
: 186529
 
6.2%
2 150900
 
5.1%
1 131446
 
4.4%
3 102608
 
3.4%
4 92196
 
3.1%
5 86628
 
2.9%
8 85549
 
2.9%
Other values (3) 246679
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11043726
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1119174
 
10.1%
/ 932645
 
8.4%
. 746144
 
6.8%
c 746116
 
6.8%
o 746116
 
6.8%
l 559587
 
5.1%
a 559587
 
5.1%
t 559587
 
5.1%
d 559587
 
5.1%
h 373058
 
3.4%
Other values (25) 4142125
37.5%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 3544051
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yale Peabody Museum
2nd row: Yale Peabody Museum
3rd row: Yale Peabody Museum
4th row: Yale Peabody Museum
5th row: Yale Peabody Museum

Value    Count   Frequency (%)
yale     186529  33.3%
peabody  186529  33.3%
museum   186529  33.3%

Most occurring characters

ValueCountFrequency (%)
e 559587
15.8%
a 373058
10.5%
373058
10.5%
u 373058
10.5%
Y 186529
 
5.3%
l 186529
 
5.3%
P 186529
 
5.3%
b 186529
 
5.3%
o 186529
 
5.3%
d 186529
 
5.3%
Other values (4) 746116
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2611406
73.7%
Uppercase Letter 559587
 
15.8%
Space Separator 373058
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 559587
21.4%
a 373058
14.3%
u 373058
14.3%
l 186529
 
7.1%
b 186529
 
7.1%
o 186529
 
7.1%
d 186529
 
7.1%
y 186529
 
7.1%
s 186529
 
7.1%
m 186529
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%
Space Separator
ValueCountFrequency (%)
373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3170993
89.5%
Common 373058
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 559587
17.6%
a 373058
11.8%
u 373058
11.8%
Y 186529
 
5.9%
l 186529
 
5.9%
P 186529
 
5.9%
b 186529
 
5.9%
o 186529
 
5.9%
d 186529
 
5.9%
y 186529
 
5.9%
Other values (3) 559587
17.6%
Common
ValueCountFrequency (%)
373058
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3544051
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 559587
15.8%
a 373058
10.5%
373058
10.5%
u 373058
10.5%
Y 186529
 
5.3%
l 186529
 
5.3%
P 186529
 
5.3%
b 186529
 
5.3%
o 186529
 
5.3%
d 186529
 
5.3%
Other values (4) 746116
21.1%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 186529
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Value  Count   Frequency (%)
1      177440  95.1%
0      9089    4.9%

Most occurring characters

ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 186529
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

Most occurring scripts

ValueCountFrequency (%)
Common 186529
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 186529
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 559587
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: YPM
2nd row: YPM
3rd row: YPM
4th row: YPM
5th row: YPM

Value  Count   Frequency (%)
ypm    186529  100.0%

Most occurring characters

ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 559587
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 559587
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 559587
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 3
Median length: 2
Mean length: 2.205882195
Min length: 2

Characters and Unicode

Total characters: 411461
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: YU
2nd row: CBS
3rd row: YU
4th row: YU
5th row: YU

Value  Count   Frequency (%)
yu     148126  79.4%
cbs    38403   20.6%
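
The length and character totals for this two-valued variable follow directly from its value counts. The sketch below re-derives them as a consistency check; the dictionary simply restates the counts for the YU and CBS values, and the variable names are illustrative.

```python
# Value counts for the two codes that occur in this variable.
value_counts = {"YU": 148_126, "CBS": 38_403}

n_rows = sum(value_counts.values())
# Each row contributes the length of its value to the character total.
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 9))
```

The row count, the total of 411461 characters and the mean length of about 2.2059 all agree with the figures reported for this variable.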

Most occurring characters

ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 411461
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 411461
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 411461
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

ownerInstitutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 559587
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: YPM
2nd row: YPM
3rd row: YPM
4th row: YPM
5th row: YPM

Value  Count   Frequency (%)
ypm    186529  100.0%

Most occurring characters

ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 559587
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 559587
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 559587
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

basisOfRecord
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Characters and Unicode

Total characters: 3170993
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value              Count   Frequency (%)
preservedspecimen  186529  100.0%

Most occurring characters

ValueCountFrequency (%)
e 932645
29.4%
r 373058
 
11.8%
P 186529
 
5.9%
s 186529
 
5.9%
v 186529
 
5.9%
d 186529
 
5.9%
S 186529
 
5.9%
p 186529
 
5.9%
c 186529
 
5.9%
i 186529
 
5.9%
Other values (2) 373058
 
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2797935
88.2%
Uppercase Letter 373058
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 932645
33.3%
r 373058
 
13.3%
s 186529
 
6.7%
v 186529
 
6.7%
d 186529
 
6.7%
p 186529
 
6.7%
c 186529
 
6.7%
i 186529
 
6.7%
m 186529
 
6.7%
n 186529
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
P 186529
50.0%
S 186529
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3170993
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 932645
29.4%
r 373058
 
11.8%
P 186529
 
5.9%
s 186529
 
5.9%
v 186529
 
5.9%
d 186529
 
5.9%
S 186529
 
5.9%
p 186529
 
5.9%
c 186529
 
5.9%
i 186529
 
5.9%
Other values (2) 373058
 
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3170993
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 932645
29.4%
r 373058
 
11.8%
P 186529
 
5.9%
s 186529
 
5.9%
v 186529
 
5.9%
d 186529
 
5.9%
S 186529
 
5.9%
p 186529
 
5.9%
c 186529
 
5.9%
i 186529
 
5.9%
Other values (2) 373058
 
11.8%

dynamicProperties
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 468
Median length: 364
Mean length: 129.8176048
Min length: 20

Characters and Unicode

Total characters: 24214748
Distinct characters: 44
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: { "irn": "1284160", "media": "1049200:23e7a3e4-d0b0-4e83-9ff2-192065f61a5a", "mm_repository_id": "1049200" }
2nd row: { "irn": "1377942", "media": "109412:2de8b571-4db4-4d56-aca6-faa3477edb7c", "mm_repository_id": "109412", "solr_long_lat": "-72.2664,41.4854" }
3rd row: { "irn": "908073", "solr_long_lat": "-72.9316,41.4070" }
4th row: { "irn": "1892063", "media": "268631:3adf8b86-2732-45cd-aef6-c1ead71bd726", "mm_repository_id": "268631", "solr_long_lat": "-119,51" }
5th row: { "irn": "2463858", "media": "1186778:f2d4000d-7289-44d9-bba3-f87582cd4f33 1186779:5b8ba8d4-ba11-4789-b865-bf0d163e1e42", "mm_repository_id": "1186778" }
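The sample values above are small JSON objects. The following is a minimal Python sketch of unpacking one of them, not the museum's or the profiler's own method. It assumes only the keys visible in the five sample rows (irn, media, mm_repository_id, solr_long_lat), that solr_long_lat is ordered longitude then latitude as its name suggests, and that multiple media entries are separated by whitespace as in the 5th row. The function name parse_dynamic_properties is hypothetical, and rows not sampled here may carry other keys.

```python
import json


def parse_dynamic_properties(raw):
    """Parse one dynamicProperties JSON string into a flat dict.

    Assumed layout (taken from the sample rows only):
      irn            -> record identifier, kept as a string
      media          -> whitespace-separated "repository_id:uuid" entries
      solr_long_lat  -> "longitude,latitude"
    """
    props = json.loads(raw)

    longitude = latitude = None
    if "solr_long_lat" in props:
        lon_text, lat_text = props["solr_long_lat"].split(",")
        longitude, latitude = float(lon_text), float(lat_text)

    # A missing "media" key yields an empty list.
    media = props.get("media", "").split()

    return {
        "irn": props["irn"],
        "media": media,
        "longitude": longitude,
        "latitude": latitude,
    }
```

Keeping irn as a string avoids guessing whether it is always numeric; the coordinates are converted to floats because the samples show plain decimal values.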
ValueCountFrequency (%)
373805
22.1%
irn 186529
 
11.0%
mm_repository_id 177182
 
10.5%
media 177182
 
10.5%
solr_long_lat 104429
 
6.2%
72.9316,41.4070 1988
 
0.1%
72.920823,41.305111 1951
 
0.1%
72.9247,41.3114 1870
 
0.1%
72.88,41.6050 1661
 
0.1%
73.036,41.5583 1211
 
0.1%
Other values (569062) 662556
39.2%

Most occurring characters

ValueCountFrequency (%)
" 2587264
 
10.7%
(space) 1503835
 
6.2%
1 1306746
 
5.4%
4 1080313
 
4.5%
2 1005309
 
4.2%
- 909018
 
3.8%
9 877072
 
3.6%
8 856889
 
3.5%
3 854143
 
3.5%
7 850393
 
3.5%
Other values (34) 12383766
51.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9188047
37.9%
Lowercase Letter 7460825
30.8%
Other Punctuation 4207683
17.4%
Space Separator 1503835
 
6.2%
Dash Punctuation 909018
 
3.8%
Connector Punctuation 566210
 
2.3%
Open Punctuation 186529
 
0.8%
Close Punctuation 186529
 
0.8%
Uppercase Letter 5686
 
< 0.1%
Math Symbol 386
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 736138
9.9%
d 734755
9.8%
i 718822
9.6%
a 709722
9.5%
r 649804
 
8.7%
o 564716
 
7.6%
m 531546
 
7.1%
b 425696
 
5.7%
c 377479
 
5.1%
f 376923
 
5.1%
Other values (8) 1635224
21.9%
Decimal Number
ValueCountFrequency (%)
1 1306746
14.2%
4 1080313
11.8%
2 1005309
10.9%
9 877072
9.5%
8 856889
9.3%
3 854143
9.3%
7 850393
9.3%
6 823084
9.0%
0 786476
8.6%
5 747622
8.1%
Uppercase Letter
ValueCountFrequency (%)
Y 2266
39.9%
P 1140
20.0%
M 1140
20.0%
U 1126
19.8%
A 7
 
0.1%
R 7
 
0.1%
Other Punctuation
ValueCountFrequency (%)
" 2587264
61.5%
: 847672
 
20.1%
, 564716
 
13.4%
. 208031
 
4.9%
Space Separator
ValueCountFrequency (%)
(space) 1503835
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 909018
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 566210
100.0%
Open Punctuation
ValueCountFrequency (%)
{ 186529
100.0%
Close Punctuation
ValueCountFrequency (%)
} 186529
100.0%
Math Symbol
ValueCountFrequency (%)
| 386
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16748237
69.2%
Latin 7466511
30.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 736138
9.9%
d 734755
9.8%
i 718822
9.6%
a 709722
9.5%
r 649804
 
8.7%
o 564716
 
7.6%
m 531546
 
7.1%
b 425696
 
5.7%
c 377479
 
5.1%
f 376923
 
5.0%
Other values (14) 1640910
22.0%
Common
ValueCountFrequency (%)
" 2587264
15.4%
(space) 1503835
 
9.0%
1 1306746
 
7.8%
4 1080313
 
6.5%
2 1005309
 
6.0%
- 909018
 
5.4%
9 877072
 
5.2%
8 856889
 
5.1%
3 854143
 
5.1%
7 850393
 
5.1%
Other values (10) 4917255
29.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24214748
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 2587264
 
10.7%
(space) 1503835
 
6.2%
1 1306746
 
5.4%
4 1080313
 
4.5%
2 1005309
 
4.2%
- 909018
 
3.8%
9 877072
 
3.6%
8 856889
 
3.5%
3 854143
 
3.5%
7 850393
 
3.5%
Other values (34) 12383766
51.1%

occurrenceID
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 8393805
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: urn:uuid:a15cbeaa-3fcd-4ec5-bfb1-27f0f8bc8910
2nd row: urn:uuid:a15e0d7e-5095-4a84-b02b-fe689f416389
3rd row: urn:uuid:a165d6f6-a6f1-4464-9d19-d307fba92359
4th row: urn:uuid:a1674501-cb24-4a3a-9ef8-4d0751ad4e63
5th row: urn:uuid:a169b221-8413-44a8-bccc-fa7045bf79df
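Every sample value above has the form urn:uuid: followed by a 36-character UUID, which matches the constant length of 45. As a hedged sketch, such identifiers could be validated with Python's standard uuid module. The helper name parse_occurrence_id is hypothetical, and the assumption that every row follows the pattern of the five samples is not verified here.

```python
import uuid


def parse_occurrence_id(value):
    """Return the UUID embedded in a 'urn:uuid:...' occurrenceID string.

    Raises ValueError if the prefix is missing or the UUID is malformed.
    """
    if not value.startswith("urn:uuid:"):
        raise ValueError(f"not a urn:uuid identifier: {value!r}")
    # uuid.UUID accepts the 'urn:uuid:' prefix and strips it itself.
    return uuid.UUID(value)
```

The returned object exposes a version attribute, which could be used to check whether all identifiers share the same UUID version as the sampled ones.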
ValueCountFrequency (%)
urn:uuid:a15cbeaa-3fcd-4ec5-bfb1-27f0f8bc8910 1
 
< 0.1%
urn:uuid:a1d1e7f6-c3fd-4cdf-92eb-181c3735610c 1
 
< 0.1%
urn:uuid:a19015bb-6550-4f6a-afda-a2f1f7015626 1
 
< 0.1%
urn:uuid:a276dcf5-b6fd-4a0e-a9c9-e3d67d274f2c 1
 
< 0.1%
urn:uuid:a18d197d-f3bd-4416-bdef-f4a9f2135f3e 1
 
< 0.1%
urn:uuid:a165d6f6-a6f1-4464-9d19-d307fba92359 1
 
< 0.1%
urn:uuid:a1674501-cb24-4a3a-9ef8-4d0751ad4e63 1
 
< 0.1%
urn:uuid:a169b221-8413-44a8-bccc-fa7045bf79df 1
 
< 0.1%
urn:uuid:a16fdf5e-d4db-44ab-8f13-95359c948f0c 1
 
< 0.1%
urn:uuid:a17057c5-3a20-44d8-b8bb-a4febbcf747a 1
 
< 0.1%
Other values (186519) 186519
> 99.9%

Most occurring characters

ValueCountFrequency (%)
- 746116
 
8.9%
u 559587
 
6.7%
d 536327
 
6.4%
4 535641
 
6.4%
8 397325
 
4.7%
a 396721
 
4.7%
b 396278
 
4.7%
9 396248
 
4.7%
: 373058
 
4.4%
c 350587
 
4.2%
Other values (12) 3705917
44.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3777062
45.0%
Lowercase Letter 3497569
41.7%
Dash Punctuation 746116
 
8.9%
Other Punctuation 373058
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 559587
16.0%
d 536327
15.3%
a 396721
11.3%
b 396278
11.3%
c 350587
10.0%
f 349437
10.0%
e 349045
10.0%
r 186529
 
5.3%
i 186529
 
5.3%
n 186529
 
5.3%
Decimal Number
ValueCountFrequency (%)
4 535641
14.2%
8 397325
10.5%
9 396248
10.5%
1 350371
9.3%
6 349908
9.3%
7 349825
9.3%
5 349699
9.3%
3 349432
9.3%
0 349369
9.2%
2 349244
9.2%
Dash Punctuation
ValueCountFrequency (%)
- 746116
100.0%
Other Punctuation
ValueCountFrequency (%)
: 373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4896236
58.3%
Latin 3497569
41.7%

Most frequent character per script

Common
ValueCountFrequency (%)
- 746116
15.2%
4 535641
10.9%
8 397325
8.1%
9 396248
8.1%
: 373058
7.6%
1 350371
7.2%
6 349908
7.1%
7 349825
7.1%
5 349699
7.1%
3 349432
7.1%
Other values (2) 698613
14.3%
Latin
ValueCountFrequency (%)
u 559587
16.0%
d 536327
15.3%
a 396721
11.3%
b 396278
11.3%
c 350587
10.0%
f 349437
10.0%
e 349045
10.0%
r 186529
 
5.3%
i 186529
 
5.3%
n 186529
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8393805
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 746116
 
8.9%
u 559587
 
6.7%
d 536327
 
6.4%
4 535641
 
6.4%
8 397325
 
4.7%
a 396721
 
4.7%
b 396278
 
4.7%
9 396248
 
4.7%
: 373058
 
4.4%
c 350587
 
4.2%
Other values (12) 3705917
44.2%

catalogNumber
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 13
Median length: 9
Mean length: 9.206482638
Min length: 9

Characters and Unicode

Total characters: 1717276
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: YU.036650
2nd row: CBS.028950
3rd row: YU.070008
4th row: YU.204399
5th row: YU.175465
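The sample catalog numbers above consist of a letter prefix (YU or CBS), a period, and a zero-padded number. A minimal sketch of separating the two parts follows. The function name split_catalog_number is hypothetical, and splitting at the first period is an assumption: the character table for this variable counts 186557 periods across 186529 rows, so a few values appear to contain more than one period.

```python
def split_catalog_number(value):
    """Split a catalogNumber such as 'YU.036650' into (prefix, number).

    Only the first period is treated as the separator; any further
    periods stay in the number part. The number is kept as a string so
    that leading zeros are preserved.
    """
    prefix, separator, number = value.partition(".")
    if not separator:
        raise ValueError(f"no period found in catalog number: {value!r}")
    return prefix, number
```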
ValueCountFrequency (%)
yu.036650 1
 
< 0.1%
yu.065082 1
 
< 0.1%
yu.065678 1
 
< 0.1%
yu.234842 1
 
< 0.1%
yu.012442 1
 
< 0.1%
yu.070008 1
 
< 0.1%
yu.204399 1
 
< 0.1%
yu.175465 1
 
< 0.1%
yu.060443 1
 
< 0.1%
yu.038995 1
 
< 0.1%
Other values (186519) 186519
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0 223252
13.0%
. 186557
10.9%
2 150900
8.8%
Y 148126
8.6%
U 148126
8.6%
1 131446
 
7.7%
3 102608
 
6.0%
4 92196
 
5.4%
5 86628
 
5.0%
8 85549
 
5.0%
Other values (6) 361888
21.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1119258
65.2%
Uppercase Letter 411461
 
24.0%
Other Punctuation 186557
 
10.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 223252
19.9%
2 150900
13.5%
1 131446
11.7%
3 102608
9.2%
4 92196
8.2%
5 86628
 
7.7%
8 85549
 
7.6%
7 83994
 
7.5%
6 83823
 
7.5%
9 78862
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%
Other Punctuation
ValueCountFrequency (%)
. 186557
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1305815
76.0%
Latin 411461
 
24.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 223252
17.1%
. 186557
14.3%
2 150900
11.6%
1 131446
10.1%
3 102608
7.9%
4 92196
7.1%
5 86628
 
6.6%
8 85549
 
6.6%
7 83994
 
6.4%
6 83823
 
6.4%
Latin
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1717276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 223252
13.0%
. 186557
10.9%
2 150900
8.8%
Y 148126
8.6%
U 148126
8.6%
1 131446
 
7.7%
3 102608
 
6.0%
4 92196
 
5.4%
5 86628
 
5.0%
8 85549
 
5.0%
Other values (6) 361888
21.1%

recordNumber
Text

Missing 

Distinct: 13601
Distinct (%): 28.6%
Missing: 139017
Missing (%): 74.5%
Memory size: 1.4 MiB

Length

Max length: 21
Median length: 20
Mean length: 3.446729247
Min length: 1

Characters and Unicode

Total characters: 163761
Distinct characters: 77
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7343
Unique (%): 15.5%

Sample

1st row: 4856
2nd row: 621
3rd row: 12
4th row: 545
5th row: 4616
ValueCountFrequency (%)
2 265
 
0.5%
1 234
 
0.5%
3 209
 
0.4%
4 207
 
0.4%
8 177
 
0.4%
6 176
 
0.4%
5 171
 
0.4%
7 163
 
0.3%
9 156
 
0.3%
10 150
 
0.3%
Other values (12986) 46388
96.0%

Most occurring characters

ValueCountFrequency (%)
1 25684
15.7%
2 19997
12.2%
3 17169
10.5%
4 15274
9.3%
5 14907
9.1%
6 13292
8.1%
7 12976
7.9%
8 12576
7.7%
0 12513
7.6%
9 12410
7.6%
Other values (67) 6963
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 156798
95.7%
Lowercase Letter 2347
 
1.4%
Uppercase Letter 1734
 
1.1%
Other Punctuation 1266
 
0.8%
Space Separator 784
 
0.5%
Dash Punctuation 744
 
0.5%
Math Symbol 44
 
< 0.1%
Open Punctuation 22
 
< 0.1%
Close Punctuation 22
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 795
33.9%
p 631
26.9%
b 355
15.1%
c 92
 
3.9%
d 77
 
3.3%
u 63
 
2.7%
n 62
 
2.6%
e 49
 
2.1%
o 32
 
1.4%
r 26
 
1.1%
Other values (15) 165
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
S 188
10.8%
D 177
10.2%
X 156
 
9.0%
B 139
 
8.0%
I 130
 
7.5%
A 124
 
7.2%
P 115
 
6.6%
C 114
 
6.6%
E 104
 
6.0%
W 75
 
4.3%
Other values (15) 412
23.8%
Decimal Number
ValueCountFrequency (%)
1 25684
16.4%
2 19997
12.8%
3 17169
10.9%
4 15274
9.7%
5 14907
9.5%
6 13292
8.5%
7 12976
8.3%
8 12576
8.0%
0 12513
8.0%
9 12410
7.9%
Other Punctuation
ValueCountFrequency (%)
. 834
65.9%
/ 221
 
17.5%
, 138
 
10.9%
# 35
 
2.8%
& 16
 
1.3%
: 10
 
0.8%
? 6
 
0.5%
' 5
 
0.4%
; 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
= 22
50.0%
+ 22
50.0%
Open Punctuation
ValueCountFrequency (%)
( 20
90.9%
[ 2
 
9.1%
Close Punctuation
ValueCountFrequency (%)
) 20
90.9%
] 2
 
9.1%
Space Separator
ValueCountFrequency (%)
(space) 784
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 744
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 159680
97.5%
Latin 4081
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 795
19.5%
p 631
15.5%
b 355
 
8.7%
S 188
 
4.6%
D 177
 
4.3%
X 156
 
3.8%
B 139
 
3.4%
I 130
 
3.2%
A 124
 
3.0%
P 115
 
2.8%
Other values (40) 1271
31.1%
Common
ValueCountFrequency (%)
1 25684
16.1%
2 19997
12.5%
3 17169
10.8%
4 15274
9.6%
5 14907
9.3%
6 13292
8.3%
7 12976
8.1%
8 12576
7.9%
0 12513
7.8%
9 12410
7.8%
Other values (17) 2882
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 163761
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 25684
15.7%
2 19997
12.2%
3 17169
10.5%
4 15274
9.3%
5 14907
9.1%
6 13292
8.1%
7 12976
7.9%
8 12576
7.7%
0 12513
7.6%
9 12410
7.6%
Other values (67) 6963
 
4.3%

recordedBy
Text

Missing 

Distinct: 3451
Distinct (%): 3.1%
Missing: 75764
Missing (%): 40.6%
Memory size: 1.4 MiB

Length

Max length: 98
Median length: 94
Mean length: 16.95773033
Min length: 2

Characters and Unicode

Total characters: 1878323
Distinct characters: 80
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1506
Unique (%): 1.4%

Sample

1st row: Charles H. Bissell
2nd row: Horatio N. Fenn
3rd row: Alfred H. Brinkman
4th row: Charles C. Godfrey
5th row: Charles H. Bissell
ValueCountFrequency (%)
h 17884
 
5.3%
charles 16797
 
5.0%
w 13815
 
4.1%
e 13699
 
4.1%
a 9233
 
2.8%
george 9101
 
2.7%
bissell 8948
 
2.7%
c 7711
 
2.3%
nichols 6625
 
2.0%
b 6460
 
1.9%
Other values (2822) 225265
67.1%

Most occurring characters

ValueCountFrequency (%)
(space) 224773
 
12.0%
e 165937
 
8.8%
r 129288
 
6.9%
a 119892
 
6.4%
l 112246
 
6.0%
. 107354
 
5.7%
n 97287
 
5.2%
s 80163
 
4.3%
i 77500
 
4.1%
o 75225
 
4.0%
Other values (70) 688658
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1200369
63.9%
Uppercase Letter 335975
 
17.9%
Space Separator 224773
 
12.0%
Other Punctuation 116139
 
6.2%
Decimal Number 799
 
< 0.1%
Close Punctuation 96
 
< 0.1%
Open Punctuation 96
 
< 0.1%
Dash Punctuation 76
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 165937
13.8%
r 129288
10.8%
a 119892
10.0%
l 112246
9.4%
n 97287
8.1%
s 80163
 
6.7%
i 77500
 
6.5%
o 75225
 
6.3%
h 59764
 
5.0%
t 52406
 
4.4%
Other values (21) 230661
19.2%
Uppercase Letter
ValueCountFrequency (%)
C 36558
10.9%
E 36255
10.8%
H 33567
10.0%
A 31547
9.4%
W 29410
 
8.8%
B 28312
 
8.4%
S 17816
 
5.3%
G 17569
 
5.2%
J 15225
 
4.5%
L 13747
 
4.1%
Other values (17) 75969
22.6%
Decimal Number
ValueCountFrequency (%)
1 400
50.1%
9 184
23.0%
4 107
 
13.4%
8 59
 
7.4%
3 20
 
2.5%
2 18
 
2.3%
5 6
 
0.8%
7 3
 
0.4%
6 2
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 107354
92.4%
, 4807
 
4.1%
; 3914
 
3.4%
' 53
 
< 0.1%
? 5
 
< 0.1%
& 4
 
< 0.1%
/ 2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 91
94.8%
] 5
 
5.2%
Open Punctuation
ValueCountFrequency (%)
( 91
94.8%
[ 5
 
5.2%
Space Separator
ValueCountFrequency (%)
(space) 224773
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 76
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1536344
81.8%
Common 341979
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 165937
 
10.8%
r 129288
 
8.4%
a 119892
 
7.8%
l 112246
 
7.3%
n 97287
 
6.3%
s 80163
 
5.2%
i 77500
 
5.0%
o 75225
 
4.9%
h 59764
 
3.9%
t 52406
 
3.4%
Other values (48) 566636
36.9%
Common
ValueCountFrequency (%)
(space) 224773
65.7%
. 107354
31.4%
, 4807
 
1.4%
; 3914
 
1.1%
1 400
 
0.1%
9 184
 
0.1%
4 107
 
< 0.1%
) 91
 
< 0.1%
( 91
 
< 0.1%
- 76
 
< 0.1%
Other values (12) 182
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1878190
> 99.9%
None 133
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 224773
 
12.0%
e 165937
 
8.8%
r 129288
 
6.9%
a 119892
 
6.4%
l 112246
 
6.0%
. 107354
 
5.7%
n 97287
 
5.2%
s 80163
 
4.3%
i 77500
 
4.1%
o 75225
 
4.0%
Other values (64) 688525
36.7%
None
ValueCountFrequency (%)
á 122
91.7%
ö 4
 
3.0%
ô 4
 
3.0%
è 1
 
0.8%
É 1
 
0.8%
ä 1
 
0.8%

individualCount
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 186529
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
1 186529
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 186529
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 186529
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 186529
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 186529
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 186529
100.0%

reproductiveCondition
Text

Missing 

Distinct: 4
Distinct (%): 16.0%
Missing: 186504
Missing (%): > 99.9%
Memory size: 1.4 MiB

Length

Max length: 21
Median length: 9
Mean length: 10.28
Min length: 8

Characters and Unicode

Total characters: 257
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 4.0%

Sample

1st row: Flowering
2nd row: Flowering
3rd row: Flowering
4th row: Flowering & Fruiting.
5th row: Fruiting
ValueCountFrequency (%)
flowering 20
62.5%
fruiting 6
 
18.8%
2
 
6.2%
male 1
 
3.1%
and 1
 
3.1%
female 1
 
3.1%
cones 1
 
3.1%

Most occurring characters

ValueCountFrequency (%)
i 32
12.5%
n 28
10.9%
F 26
10.1%
r 26
10.1%
g 26
10.1%
e 24
9.3%
l 22
8.6%
o 21
8.2%
w 20
7.8%
(space) 7
 
2.7%
Other values (10) 25
9.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 220
85.6%
Uppercase Letter 26
 
10.1%
Space Separator 7
 
2.7%
Other Punctuation 4
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 32
14.5%
n 28
12.7%
r 26
11.8%
g 26
11.8%
e 24
10.9%
l 22
10.0%
o 21
9.5%
w 20
9.1%
t 6
 
2.7%
u 6
 
2.7%
Other values (6) 9
 
4.1%
Other Punctuation
ValueCountFrequency (%)
& 2
50.0%
. 2
50.0%
Uppercase Letter
ValueCountFrequency (%)
F 26
100.0%
Space Separator
ValueCountFrequency (%)
(space) 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 246
95.7%
Common 11
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 32
13.0%
n 28
11.4%
F 26
10.6%
r 26
10.6%
g 26
10.6%
e 24
9.8%
l 22
8.9%
o 21
8.5%
w 20
8.1%
t 6
 
2.4%
Other values (7) 15
6.1%
Common
ValueCountFrequency (%)
(space) 7
63.6%
& 2
 
18.2%
. 2
 
18.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 257
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 32
12.5%
n 28
10.9%
F 26
10.1%
r 26
10.1%
g 26
10.1%
e 24
9.3%
l 22
8.6%
o 21
8.2%
w 20
7.8%
(space) 7
 
2.7%
Other values (10) 25
9.7%

preparations
Text

Constant  Missing 

Distinct: 1
Distinct (%): 1.9%
Missing: 186476
Missing (%): > 99.9%
Memory size: 1.4 MiB

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Characters and Unicode

Total characters: 795
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: tissue (frozen)
2nd row: tissue (frozen)
3rd row: tissue (frozen)
4th row: tissue (frozen)
5th row: tissue (frozen)
ValueCountFrequency (%)
tissue 53
50.0%
frozen 53
50.0%

Most occurring characters

ValueCountFrequency (%)
s 106
13.3%
e 106
13.3%
t 53
 
6.7%
i 53
 
6.7%
u 53
 
6.7%
(space) 53
 
6.7%
( 53
 
6.7%
f 53
 
6.7%
r 53
 
6.7%
o 53
 
6.7%
Other values (3) 159
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 636
80.0%
Space Separator 53
 
6.7%
Open Punctuation 53
 
6.7%
Close Punctuation 53
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 106
16.7%
e 106
16.7%
t 53
8.3%
i 53
8.3%
u 53
8.3%
f 53
8.3%
r 53
8.3%
o 53
8.3%
z 53
8.3%
n 53
8.3%
Space Separator
ValueCountFrequency (%)
(space) 53
100.0%
Open Punctuation
ValueCountFrequency (%)
( 53
100.0%
Close Punctuation
ValueCountFrequency (%)
) 53
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 636
80.0%
Common 159
 
20.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 106
16.7%
e 106
16.7%
t 53
8.3%
i 53
8.3%
u 53
8.3%
f 53
8.3%
r 53
8.3%
o 53
8.3%
z 53
8.3%
n 53
8.3%
Common
ValueCountFrequency (%)
(space) 53
33.3%
( 53
33.3%
) 53
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 106
13.3%
e 106
13.3%
t 53
 
6.7%
i 53
 
6.7%
u 53
 
6.7%
(space) 53
 
6.7%
( 53
 
6.7%
f 53
 
6.7%
r 53
 
6.7%
o 53
 
6.7%
Other values (3) 159
20.0%

disposition
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 2424877
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: in collection
2nd row: in collection
3rd row: in collection
4th row: in collection
5th row: in collection
ValueCountFrequency (%)
in 186529
50.0%
collection 186529
50.0%

Most occurring characters

ValueCountFrequency (%)
i 373058
15.4%
n 373058
15.4%
c 373058
15.4%
o 373058
15.4%
l 373058
15.4%
(space) 186529
7.7%
e 186529
7.7%
t 186529
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2238348
92.3%
Space Separator 186529
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 373058
16.7%
n 373058
16.7%
c 373058
16.7%
o 373058
16.7%
l 373058
16.7%
e 186529
8.3%
t 186529
8.3%
Space Separator
ValueCountFrequency (%)
(space) 186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2238348
92.3%
Common 186529
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 373058
16.7%
n 373058
16.7%
c 373058
16.7%
o 373058
16.7%
l 373058
16.7%
e 186529
8.3%
t 186529
8.3%
Common
ValueCountFrequency (%)
(space) 186529
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2424877
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 373058
15.4%
n 373058
15.4%
c 373058
15.4%
o 373058
15.4%
l 373058
15.4%
(space) 186529
7.7%
e 186529
7.7%
t 186529
7.7%

associatedMedia
Text

Missing 

Distinct: 177182
Distinct (%): 100.0%
Missing: 9347
Missing (%): 5.0%
Memory size: 1.4 MiB

Length

Max length: 113
Median length: 113
Mean length: 111.7975076
Min length: 107

Characters and Unicode

Total characters: 19808506
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 177182
Unique (%): 100.0%

Sample

1st row: https://images.collections.yale.edu/iiif/2/ypm:23e7a3e4-d0b0-4e83-9ff2-192065f61a5a/full/!1920,1920/0/default.jpg
2nd row: https://images.collections.yale.edu/iiif/2/ypm:2de8b571-4db4-4d56-aca6-faa3477edb7c/full/!1920,1920/0/default.jpg
3rd row: https://images.collections.yale.edu/iiif/2/ypm:3adf8b86-2732-45cd-aef6-c1ead71bd726/full/full/0/default.jpg
4th row: https://images.collections.yale.edu/iiif/2/ypm:f2d4000d-7289-44d9-bba3-f87582cd4f33/full/!1920,1920/0/default.jpg
5th row: https://images.collections.yale.edu/iiif/2/ypm:ecfc1bbe-adbf-4999-9263-99dabd159dcc/full/!1920,1920/0/default.jpg
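The sample URLs above share one layout, which appears to follow the IIIF Image API pattern of identifier/region/size/rotation/quality.format. That reading is an assumption based on the /iiif/2/ path segment and is not stated in the report. Under that assumption, a URL could be split into its parts as sketched below; the names IIIF_PATTERN and split_iiif_url are hypothetical.

```python
import re

# Assumed layout: <base>/<identifier>/<region>/<size>/<rotation>/<quality>.<format>
IIIF_PATTERN = re.compile(
    r"^(?P<base>https://[^/]+/iiif/2)/(?P<identifier>[^/]+)/"
    r"(?P<region>[^/]+)/(?P<size>[^/]+)/(?P<rotation>[^/]+)/"
    r"(?P<quality>[^./]+)\.(?P<format>\w+)$"
)


def split_iiif_url(url):
    """Split an associatedMedia URL into its assumed IIIF components."""
    match = IIIF_PATTERN.match(url)
    if match is None:
        raise ValueError(f"URL does not match the assumed layout: {url!r}")
    return match.groupdict()
```

In the samples, the UUID after the ypm: prefix of the identifier matches the UUID in the media entry of dynamicProperties for the same record, so the identifier part could serve as a join key between the two columns.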
ValueCountFrequency (%)
https://images.collections.yale.edu/iiif/2/ypm:23e7a3e4-d0b0-4e83-9ff2-192065f61a5a/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:6229c87b-4849-496d-a22f-b765c2a42861/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:bba84f78-a996-4f2d-934f-eefcbbce068a/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:5ce0c16d-e619-4489-b129-7558d6de5f72/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:3adf8b86-2732-45cd-aef6-c1ead71bd726/full/full/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:f2d4000d-7289-44d9-bba3-f87582cd4f33/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:ecfc1bbe-adbf-4999-9263-99dabd159dcc/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:2e6408d2-310d-4eb9-8f02-5c0755d9334c/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:05fe5091-3c69-49f4-bd1b-72d1e8ab0c24/full/!1920,1920/0/default.jpg 1
 
< 0.1%
https://images.collections.yale.edu/iiif/2/ypm:fbd4a159-7a17-416c-8c71-7db8a3c69968/full/!1920,1920/0/default.jpg 1
 
< 0.1%
Other values (177172) 177172
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 1594638
 
8.1%
e 1217933
 
6.1%
l 1134112
 
5.7%
a 907729
 
4.6%
f 899939
 
4.5%
i 885910
 
4.5%
0 793299
 
4.0%
2 792595
 
4.0%
- 708728
 
3.6%
. 708728
 
3.6%
Other values (25) 10164895
51.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11082612
55.9%
Decimal Number 5076092
25.6%
Other Punctuation 2941074
 
14.8%
Dash Punctuation 708728
 
3.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1217933
11.0%
l 1134112
 
10.2%
a 907729
 
8.2%
f 899939
 
8.1%
i 885910
 
8.0%
t 708728
 
6.4%
d 687536
 
6.2%
c 685902
 
6.2%
u 567056
 
5.1%
s 531546
 
4.8%
Other values (9) 2856221
25.8%
Decimal Number
ValueCountFrequency (%)
0 793299
15.6%
2 792595
15.6%
9 659348
13.0%
1 615316
12.1%
4 509545
10.0%
8 376634
7.4%
7 332887
6.6%
5 332244
6.5%
3 332114
6.5%
6 332110
6.5%
Other Punctuation
ValueCountFrequency (%)
/ 1594638
54.2%
. 708728
24.1%
: 354364
 
12.0%
! 141672
 
4.8%
, 141672
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 708728
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11082612
55.9%
Common 8725894
44.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1217933
11.0%
l 1134112
 
10.2%
a 907729
 
8.2%
f 899939
 
8.1%
i 885910
 
8.0%
t 708728
 
6.4%
d 687536
 
6.2%
c 685902
 
6.2%
u 567056
 
5.1%
s 531546
 
4.8%
Other values (9) 2856221
25.8%
Common
ValueCountFrequency (%)
/ 1594638
18.3%
0 793299
9.1%
2 792595
9.1%
- 708728
8.1%
. 708728
8.1%
9 659348
7.6%
1 615316
 
7.1%
4 509545
 
5.8%
8 376634
 
4.3%
: 354364
 
4.1%
Other values (6) 1612699
18.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19808506
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 1594638
 
8.1%
e 1217933
 
6.1%
l 1134112
 
5.7%
a 907729
 
4.6%
f 899939
 
4.5%
i 885910
 
4.5%
0 793299
 
4.0%
2 792595
 
4.0%
- 708728
 
3.6%
. 708728
 
3.6%
Other values (25) 10164895
51.3%

associatedReferences
Text

Missing 

Distinct: 3765
Distinct (%): 37.4%
Missing: 176462
Missing (%): 94.6%
Memory size: 1.4 MiB

Length

Max length: 481
Median length: 338
Mean length: 43.67040826
Min length: 1

Characters and Unicode

Total characters: 439630
Distinct characters: 93
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3122
Unique (%): 31.0%

Sample

1st row: Det. by: Martin C. Van Boskirk 1997|
2nd row: Det. by: Alexander W. Evans
3rd row: ISOTYPE. Note: Proc. Amer. Acad. Arts. 22: 420. 1887.
4th row: ISOSYNTYPE. Note: Mem. Amer. Acad. Arts. n.s. 520. 1862.
5th row: ISOTYPE. Note: Pl. Wright. (Grisebach) 1: 173. 1860.
ValueCountFrequency (%)
by 6513
 
8.7%
det 6278
 
8.4%
note 4081
 
5.5%
isotype 2637
 
3.5%
of 1965
 
2.6%
w 1081
 
1.4%
the 1033
 
1.4%
syntype 884
 
1.2%
arts 839
 
1.1%
amer 784
 
1.0%
Other values (2983) 48652
65.1%

Most occurring characters

ValueCountFrequency (%)
(space) 64680
 
14.7%
. 32693
 
7.4%
e 30865
 
7.0%
t 21244
 
4.8%
o 17185
 
3.9%
a 16258
 
3.7%
r 16177
 
3.7%
: 13805
 
3.1%
n 13259
 
3.0%
i 11346
 
2.6%
Other values (83) 202118
46.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 202214
46.0%
Uppercase Letter 77380
 
17.6%
Space Separator 64680
 
14.7%
Other Punctuation 47613
 
10.8%
Decimal Number 41766
 
9.5%
Math Symbol 4292
 
1.0%
Dash Punctuation 591
 
0.1%
Close Punctuation 546
 
0.1%
Open Punctuation 546
 
0.1%
Other Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 30865
15.3%
t 21244
10.5%
o 17185
 
8.5%
a 16258
 
8.0%
r 16177
 
8.0%
n 13259
 
6.6%
i 11346
 
5.6%
l 10399
 
5.1%
y 8787
 
4.3%
s 8678
 
4.3%
Other values (24) 48016
23.7%
Uppercase Letter
ValueCountFrequency (%)
D 6983
 
9.0%
P 6970
 
9.0%
E 6536
 
8.4%
S 6410
 
8.3%
A 6193
 
8.0%
N 6019
 
7.8%
Y 5458
 
7.1%
T 5403
 
7.0%
C 3719
 
4.8%
O 3662
 
4.7%
Other values (16) 20027
25.9%
Other Punctuation
ValueCountFrequency (%)
. 32693
68.7%
: 13805
29.0%
; 451
 
0.9%
, 420
 
0.9%
' 92
 
0.2%
? 81
 
0.2%
& 36
 
0.1%
" 27
 
0.1%
# 6
 
< 0.1%
/ 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 10167
24.3%
8 6112
14.6%
9 4977
11.9%
6 3913
 
9.4%
2 3487
 
8.3%
7 3041
 
7.3%
5 3040
 
7.3%
4 2459
 
5.9%
3 2393
 
5.7%
0 2177
 
5.2%
Math Symbol
ValueCountFrequency (%)
| 4244
98.9%
= 44
 
1.0%
+ 4
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 380
69.6%
] 164
30.0%
} 2
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 379
69.4%
[ 165
30.2%
{ 2
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 563
95.3%
(non-ASCII dash) 28
 
4.7%
Space Separator
ValueCountFrequency (%)
(space) 64680
100.0%
Other Symbol
ValueCountFrequency (%)
° 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 279594
63.6%
Common 160036
36.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 30865
 
11.0%
t 21244
 
7.6%
o 17185
 
6.1%
a 16258
 
5.8%
r 16177
 
5.8%
n 13259
 
4.7%
i 11346
 
4.1%
l 10399
 
3.7%
y 8787
 
3.1%
s 8678
 
3.1%
Other values (50) 125396
44.8%
Common
ValueCountFrequency (%)
64680
40.4%
. 32693
20.4%
: 13805
 
8.6%
1 10167
 
6.4%
8 6112
 
3.8%
9 4977
 
3.1%
| 4244
 
2.7%
6 3913
 
2.4%
2 3487
 
2.2%
7 3041
 
1.9%
Other values (23) 12917
 
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 439412
> 99.9%
None 190
 
< 0.1%
Punctuation 28
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
64680
 
14.7%
. 32693
 
7.4%
e 30865
 
7.0%
t 21244
 
4.8%
o 17185
 
3.9%
a 16258
 
3.7%
r 16177
 
3.7%
: 13805
 
3.1%
n 13259
 
3.0%
i 11346
 
2.6%
Other values (73) 201900
45.9%
None
ValueCountFrequency (%)
á 125
65.8%
ü 26
 
13.7%
é 23
 
12.1%
ö 8
 
4.2%
ä 2
 
1.1%
è 2
 
1.1%
° 2
 
1.1%
ë 1
 
0.5%
ñ 1
 
0.5%
Punctuation
ValueCountFrequency (%)
28
100.0%

associatedTaxa
Text

Missing 

Distinct: 745
Distinct (%): 99.7%
Missing: 185782
Missing (%): 99.6%
Memory size: 1.4 MiB
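The percentages in this summary can be reproduced from the raw counts. The sketch below is an assumption about how they are derived, checked only against the figures shown in this report: Missing (%) is taken relative to all 186529 observations (the dataset total from the overview), while Distinct (%) appears to be relative to the non-missing values. The function name `summary_percentages` is hypothetical.

```python
def summary_percentages(n_rows, n_missing, n_distinct):
    """Return (missing %, distinct %), each rounded to one decimal place."""
    n_present = n_rows - n_missing  # values that are actually filled in
    missing_pct = 100.0 * n_missing / n_rows
    # Assumption: distinct % is measured against non-missing values only.
    distinct_pct = 100.0 * n_distinct / n_present if n_present else 0.0
    return round(missing_pct, 1), round(distinct_pct, 1)


# Counts for associatedTaxa, copied from the summary above.
print(summary_percentages(186529, 185782, 745))
```

With only 747 non-missing values, the 745 distinct entries account for nearly every filled-in row, which is why a column that is 99.6% empty still reports 99.7% distinct.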

Length

Max length: 109
Median length: 21
Mean length: 29.93574297
Min length: 9

Characters and Unicode

Total characters: 22362
Distinct characters: 32
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
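As a minimal sketch of how such per-category counts could be computed, Python's standard `unicodedata` module returns the general category of each code point as a two-letter code ("Ll" for Lowercase Letter, "Nd" for Decimal Number, "Sm" for Math Symbol, and so on). The helper name `category_counts` is hypothetical, and this is not necessarily how the report itself was generated. The standard library does not expose script or block properties, so only categories are covered.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# One of the sample values shown for associatedTaxa.
counts = category_counts(["same sheet: YU.064978"])
print(counts)
```

The pipe character used as a value separator in this column is classified as a Math Symbol, which explains why that category appears in a column that contains no mathematics.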

Unique

Unique: 743
Unique (%): 99.5%

Sample

1st row: same sheet: YU.064497|same sheet: YU.064498|same sheet: YU.064500
2nd row: same sheet: YU.064978
3rd row: YU.000992
4th row: same sheet: YU.064670
5th row: same sheet: YU.001167

Value                   Count     Frequency (%)
sheet                   965       35.8%
same                    649       24.1%
replicate               9         0.3%
yu.065496|same          5         0.2%
yu.014017|same          5         0.2%
yu.014019|same          5         0.2%
yu.014020|same          5         0.2%
yu.014022               5         0.2%
yu.065492               5         0.2%
yu.065494|same          5         0.2%
Other values (832)      1037      38.5%
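The sample values suggest that each associatedTaxa entry is a pipe-delimited list whose items are either a bare catalog number or a catalog number preceded by a relation such as "same sheet:". The sketch below splits a value on that assumption; the pattern and the function name `parse_associated` are illustrative and have not been checked against every row in the column.

```python
import re

# Assumed item shape: optional "relation: " prefix, then one catalog token.
ENTRY = re.compile(r"^(?:(?P<relation>[^:|]+):\s*)?(?P<catalog>\S+)$")


def parse_associated(value):
    """Split a pipe-delimited value into (relation, catalog number) pairs."""
    pairs = []
    for part in value.split("|"):
        part = part.strip()
        match = ENTRY.match(part)
        if match:
            pairs.append((match.group("relation"), match.group("catalog")))
        else:
            # Keep anything that does not fit the assumed shape unchanged.
            pairs.append((None, part))
    return pairs


print(parse_associated("same sheet: YU.064497|same sheet: YU.064498"))
```

Splitting before counting would avoid word-frequency artifacts such as "yu.065496|same", where a catalog number and the next relation are fused into one token.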

Most occurring characters

ValueCountFrequency (%)
e 2925
13.1%
1948
 
8.7%
s 1930
 
8.6%
0 1853
 
8.3%
6 1270
 
5.7%
. 1134
 
5.1%
Y 1133
 
5.1%
U 1126
 
5.0%
t 983
 
4.4%
: 983
 
4.4%
Other values (22) 7077
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8823
39.5%
Decimal Number 6801
30.4%
Uppercase Letter 2287
 
10.2%
Other Punctuation 2117
 
9.5%
Space Separator 1948
 
8.7%
Math Symbol 386
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2925
33.2%
s 1930
21.9%
t 983
 
11.1%
a 977
 
11.1%
h 971
 
11.0%
m 965
 
10.9%
r 18
 
0.2%
p 12
 
0.1%
c 12
 
0.1%
i 12
 
0.1%
Other values (2) 18
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 1853
27.2%
6 1270
18.7%
5 693
 
10.2%
1 596
 
8.8%
4 587
 
8.6%
2 450
 
6.6%
9 375
 
5.5%
7 364
 
5.4%
3 322
 
4.7%
8 291
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
Y 1133
49.5%
U 1126
49.2%
A 7
 
0.3%
P 7
 
0.3%
R 7
 
0.3%
M 7
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 1134
53.6%
: 983
46.4%
Space Separator
ValueCountFrequency (%)
1948
100.0%
Math Symbol
ValueCountFrequency (%)
| 386
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11252
50.3%
Latin 11110
49.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2925
26.3%
s 1930
17.4%
Y 1133
 
10.2%
U 1126
 
10.1%
t 983
 
8.8%
a 977
 
8.8%
h 971
 
8.7%
m 965
 
8.7%
r 18
 
0.2%
p 12
 
0.1%
Other values (8) 70
 
0.6%
Common
ValueCountFrequency (%)
1948
17.3%
0 1853
16.5%
6 1270
11.3%
. 1134
10.1%
: 983
8.7%
5 693
 
6.2%
1 596
 
5.3%
4 587
 
5.2%
2 450
 
4.0%
| 386
 
3.4%
Other values (4) 1352
12.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22362
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2925
13.1%
1948
 
8.7%
s 1930
 
8.6%
0 1853
 
8.3%
6 1270
 
5.7%
. 1134
 
5.1%
Y 1133
 
5.1%
U 1126
 
5.0%
t 983
 
4.4%
: 983
 
4.4%
Other values (22) 7077
31.6%

Distinct: 186516
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 509
Median length: 29
Mean length: 32.71075811
Min length: 24

Characters and Unicode

Total characters: 6101505
Distinct characters: 89
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186503
Unique (%): > 99.9%

Sample

1st row: YU number 36650; lot count 1
2nd row: CBS number 28950; lot count 1
3rd row: YU number 70008; lot count 1
4th row: YU number 204399; lot count 1
5th row: YU number 175465; lot count 1

Value                   Count     Frequency (%)
1                       186654    15.5%
number                  186532    15.5%
lot                     186530    15.5%
count                   186529    15.5%
yu                      148138    12.3%
cbs                     38404     3.2%
tall                    1591      0.1%
dryopteris              1419      0.1%
ca                      1393      0.1%
carex                   1306      0.1%
Other values (156716)   268264    22.2%

Most occurring characters

ValueCountFrequency (%)
1020231
16.7%
o 410162
 
6.7%
t 410146
 
6.7%
n 405993
 
6.7%
u 403292
 
6.6%
1 323313
 
5.3%
e 243622
 
4.0%
r 227752
 
3.7%
l 226680
 
3.7%
; 215837
 
3.5%
Other values (79) 2214477
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3196751
52.4%
Decimal Number 1190785
 
19.5%
Space Separator 1020231
 
16.7%
Uppercase Letter 457401
 
7.5%
Other Punctuation 232829
 
3.8%
Math Symbol 2789
 
< 0.1%
Dash Punctuation 563
 
< 0.1%
Close Punctuation 64
 
< 0.1%
Open Punctuation 64
 
< 0.1%
Connector Punctuation 27
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 410162
12.8%
t 410146
12.8%
n 405993
12.7%
u 403292
12.6%
e 243622
7.6%
r 227752
7.1%
l 226680
7.1%
c 212563
6.6%
m 212319
6.6%
b 193316
6.0%
Other values (18) 250906
7.8%
Uppercase Letter
ValueCountFrequency (%)
Y 151690
33.2%
U 149387
32.7%
C 43689
 
9.6%
S 40747
 
8.9%
B 39685
 
8.7%
P 6322
 
1.4%
A 4630
 
1.0%
D 3750
 
0.8%
M 3438
 
0.8%
H 2160
 
0.5%
Other values (16) 11903
 
2.6%
Other Punctuation
ValueCountFrequency (%)
; 215837
92.7%
. 11916
 
5.1%
, 3123
 
1.3%
: 1459
 
0.6%
& 336
 
0.1%
/ 76
 
< 0.1%
' 40
 
< 0.1%
" 24
 
< 0.1%
% 7
 
< 0.1%
? 7
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 323313
27.2%
2 153590
12.9%
3 104319
 
8.8%
4 93865
 
7.9%
0 89929
 
7.6%
5 89211
 
7.5%
8 86892
 
7.3%
6 85406
 
7.2%
7 84783
 
7.1%
9 79477
 
6.7%
Math Symbol
ValueCountFrequency (%)
= 2773
99.4%
~ 5
 
0.2%
< 4
 
0.1%
+ 4
 
0.1%
> 3
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 562
99.8%
1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 63
98.4%
] 1
 
1.6%
Open Punctuation
ValueCountFrequency (%)
( 63
98.4%
[ 1
 
1.6%
Space Separator
ValueCountFrequency (%)
1020231
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 27
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3654152
59.9%
Common 2447353
40.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 410162
11.2%
t 410146
11.2%
n 405993
11.1%
u 403292
11.0%
e 243622
 
6.7%
r 227752
 
6.2%
l 226680
 
6.2%
c 212563
 
5.8%
m 212319
 
5.8%
b 193316
 
5.3%
Other values (44) 708307
19.4%
Common
ValueCountFrequency (%)
1020231
41.7%
1 323313
 
13.2%
; 215837
 
8.8%
2 153590
 
6.3%
3 104319
 
4.3%
4 93865
 
3.8%
0 89929
 
3.7%
5 89211
 
3.6%
8 86892
 
3.6%
6 85406
 
3.5%
Other values (25) 184760
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6101443
> 99.9%
None 61
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1020231
16.7%
o 410162
 
6.7%
t 410146
 
6.7%
n 405993
 
6.7%
u 403292
 
6.6%
1 323313
 
5.3%
e 243622
 
4.0%
r 227752
 
3.7%
l 226680
 
3.7%
; 215837
 
3.5%
Other values (75) 2214415
36.3%
None
ValueCountFrequency (%)
á 30
49.2%
ñ 30
49.2%
1
 
1.6%
Punctuation
ValueCountFrequency (%)
1
100.0%

Distinct: 17165
Distinct (%): 9.2%
Missing: 18
Missing (%): < 0.1%
Memory size: 1.4 MiB

Length

Max length: 172
Median length: 139
Mean length: 16.42291339
Min length: 3

Characters and Unicode

Total characters: 3063054
Distinct characters: 59
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8494
Unique (%): 4.6%

Sample

1st row: Luzula bulbosa
2nd row: Gentiana clausa
3rd row: Carex muhlenbergii|Carex muhlenbergii
4th row: Lophocolea minor
5th row: Plantae

Value                   Count     Frequency (%)
plantae                 28374     8.5%
carex                   8803      2.6%
var                     4014      1.2%
dryopteris              2392      0.7%
sphagnum                2360      0.7%
juncus                  1814      0.5%
frullania               1708      0.5%
asplenium               1557      0.5%
scapania                1517      0.5%
canadensis              1511      0.5%
Other values (14275)    280732    83.9%

Most occurring characters

ValueCountFrequency (%)
a 399621
13.0%
i 270647
 
8.8%
e 211263
 
6.9%
l 201909
 
6.6%
r 180523
 
5.9%
n 175073
 
5.7%
u 167028
 
5.5%
o 161624
 
5.3%
s 159635
 
5.2%
t 149512
 
4.9%
Other values (49) 986219
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2714576
88.6%
Uppercase Letter 190758
 
6.2%
Space Separator 148271
 
4.8%
Other Punctuation 4477
 
0.1%
Math Symbol 4244
 
0.1%
Dash Punctuation 726
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 399621
14.7%
i 270647
10.0%
e 211263
 
7.8%
l 201909
 
7.4%
r 180523
 
6.7%
n 175073
 
6.4%
u 167028
 
6.2%
o 161624
 
6.0%
s 159635
 
5.9%
t 149512
 
5.5%
Other values (16) 637741
23.5%
Uppercase Letter
ValueCountFrequency (%)
P 49853
26.1%
C 26611
14.0%
S 17407
 
9.1%
A 14516
 
7.6%
L 11096
 
5.8%
D 7958
 
4.2%
R 7175
 
3.8%
E 7079
 
3.7%
B 6558
 
3.4%
M 6188
 
3.2%
Other values (16) 36317
19.0%
Other Punctuation
ValueCountFrequency (%)
. 4475
> 99.9%
? 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
148271
100.0%
Math Symbol
ValueCountFrequency (%)
| 4244
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 726
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2905334
94.9%
Common 157720
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 399621
13.8%
i 270647
 
9.3%
e 211263
 
7.3%
l 201909
 
6.9%
r 180523
 
6.2%
n 175073
 
6.0%
u 167028
 
5.7%
o 161624
 
5.6%
s 159635
 
5.5%
t 149512
 
5.1%
Other values (42) 828499
28.5%
Common
ValueCountFrequency (%)
148271
94.0%
. 4475
 
2.8%
| 4244
 
2.7%
- 726
 
0.5%
? 2
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3063054
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 399621
13.0%
i 270647
 
8.8%
e 211263
 
6.9%
l 201909
 
6.6%
r 180523
 
5.9%
n 175073
 
5.7%
u 167028
 
5.5%
o 161624
 
5.3%
s 159635
 
5.2%
t 149512
 
4.9%
Other values (49) 986219
32.2%

eventDate
Text

Missing 

Distinct: 19310
Distinct (%): 18.6%
Missing: 82488
Missing (%): 44.2%
Memory size: 1.4 MiB

Length

Max length: 21
Median length: 10
Mean length: 9.409915322
Min length: 4

Characters and Unicode

Total characters: 979017
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7591
Unique (%): 7.3%

Sample

1st row: 1919-10-01
2nd row: 1822
3rd row: 1909-05-27
4th row: 1905-07-23
5th row: 1901-09-02

Value                   Count     Frequency (%)
1860/1864               747       0.7%
1822                    660       0.6%
1920                    497       0.5%
1914                    302       0.3%
1875                    288       0.3%
1893                    280       0.3%
1902-08-20/1902-08-25   228       0.2%
1876                    213       0.2%
1915                    208       0.2%
1862                    205       0.2%
Other values (19300)    100413    96.5%
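The eventDate values mix several precisions: a bare year (1822), a full date (1919-10-01), and intervals joined by a slash (1860/1864, 1902-08-20/1902-08-25). The sketch below normalises these shapes into a (start, end) pair. It assumes only the forms visible in this report, plus year-month, and the function names are hypothetical.

```python
import re

# Accepts 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'.
_DATE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def parse_date(text):
    """Return (year, month, day); parts absent from the text are None."""
    match = _DATE.match(text)
    if match is None:
        raise ValueError(f"unsupported date: {text!r}")
    return tuple(int(g) if g is not None else None for g in match.groups())


def parse_event_date(value):
    """Return (start, end); a single date yields the same tuple twice."""
    start, _, end = value.partition("/")
    first = parse_date(start)
    return first, (parse_date(end) if end else first)


for sample in ("1919-10-01", "1822", "1860/1864", "1902-08-20/1902-08-25"):
    print(sample, parse_event_date(sample))
```

Keeping missing parts as None rather than defaulting them to 1 preserves the distinction between "collected in 1822" and "collected on 1 January 1822".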

Most occurring characters

ValueCountFrequency (%)
1 183000
18.7%
- 181446
18.5%
0 169513
17.3%
9 119086
12.2%
8 76571
7.8%
2 64991
 
6.6%
7 43898
 
4.5%
6 39271
 
4.0%
3 35897
 
3.7%
5 35553
 
3.6%
Other values (2) 29791
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 793868
81.1%
Dash Punctuation 181446
 
18.5%
Other Punctuation 3703
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 183000
23.1%
0 169513
21.4%
9 119086
15.0%
8 76571
9.6%
2 64991
 
8.2%
7 43898
 
5.5%
6 39271
 
4.9%
3 35897
 
4.5%
5 35553
 
4.5%
4 26088
 
3.3%
Dash Punctuation
ValueCountFrequency (%)
- 181446
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 3703
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 979017
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 183000
18.7%
- 181446
18.5%
0 169513
17.3%
9 119086
12.2%
8 76571
7.8%
2 64991
 
6.6%
7 43898
 
4.5%
6 39271
 
4.0%
3 35897
 
3.7%
5 35553
 
3.6%
Other values (2) 29791
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 979017
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 183000
18.7%
- 181446
18.5%
0 169513
17.3%
9 119086
12.2%
8 76571
7.8%
2 64991
 
6.6%
7 43898
 
4.5%
6 39271
 
4.0%
3 35897
 
3.7%
5 35553
 
3.6%
Other values (2) 29791
 
3.0%

year
Text

Missing 

Distinct: 207
Distinct (%): 0.2%
Missing: 82575
Missing (%): 44.3%
Memory size: 1.4 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 415816
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): < 0.1%

Sample

1st row: 1919
2nd row: 1822
3rd row: 1909
4th row: 1905
5th row: 1901

Value                   Count     Frequency (%)
1903                    3571      3.4%
1908                    3439      3.3%
1906                    3126      3.0%
1909                    3111      3.0%
1907                    2478      2.4%
1905                    2459      2.4%
1902                    2451      2.4%
1901                    2263      2.2%
1904                    2259      2.2%
1910                    2166      2.1%
Other values (197)      76631     73.7%

Most occurring characters

ValueCountFrequency (%)
1 127395
30.6%
9 95678
23.0%
8 46344
 
11.1%
0 41582
 
10.0%
2 25600
 
6.2%
3 20302
 
4.9%
7 16990
 
4.1%
5 15248
 
3.7%
6 14626
 
3.5%
4 12051
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 415816
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 127395
30.6%
9 95678
23.0%
8 46344
 
11.1%
0 41582
 
10.0%
2 25600
 
6.2%
3 20302
 
4.9%
7 16990
 
4.1%
5 15248
 
3.7%
6 14626
 
3.5%
4 12051
 
2.9%

Most occurring scripts

ValueCountFrequency (%)
Common 415816
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 127395
30.6%
9 95678
23.0%
8 46344
 
11.1%
0 41582
 
10.0%
2 25600
 
6.2%
3 20302
 
4.9%
7 16990
 
4.1%
5 15248
 
3.7%
6 14626
 
3.5%
4 12051
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 415816
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 127395
30.6%
9 95678
23.0%
8 46344
 
11.1%
0 41582
 
10.0%
2 25600
 
6.2%
3 20302
 
4.9%
7 16990
 
4.1%
5 15248
 
3.7%
6 14626
 
3.5%
4 12051
 
2.9%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 92324
Missing (%): 49.5%
Memory size: 1.4 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.090207526
Min length: 1

Characters and Unicode

Total characters: 102703
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10
2nd row: 5
3rd row: 7
4th row: 9
5th row: 6

Value                   Count     Frequency (%)
8                       18400     19.5%
7                       17452     18.5%
6                       15152     16.1%
9                       13074     13.9%
5                       10788     11.5%
10                      5143      5.5%
4                       4753      5.0%
3                       2673      2.8%
11                      2070      2.2%
2                       1924      2.0%
Other values (2)        2776      2.9%

Most occurring characters

ValueCountFrequency (%)
8 18400
17.9%
7 17452
17.0%
6 15152
14.8%
9 13074
12.7%
1 12059
11.7%
5 10788
10.5%
0 5143
 
5.0%
4 4753
 
4.6%
2 3209
 
3.1%
3 2673
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 102703
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 18400
17.9%
7 17452
17.0%
6 15152
14.8%
9 13074
12.7%
1 12059
11.7%
5 10788
10.5%
0 5143
 
5.0%
4 4753
 
4.6%
2 3209
 
3.1%
3 2673
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Common 102703
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 18400
17.9%
7 17452
17.0%
6 15152
14.8%
9 13074
12.7%
1 12059
11.7%
5 10788
10.5%
0 5143
 
5.0%
4 4753
 
4.6%
2 3209
 
3.1%
3 2673
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 102703
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 18400
17.9%
7 17452
17.0%
6 15152
14.8%
9 13074
12.7%
1 12059
11.7%
5 10788
10.5%
0 5143
 
5.0%
4 4753
 
4.6%
2 3209
 
3.1%
3 2673
 
2.6%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 103361
Missing (%): 55.4%
Memory size: 1.4 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.718689881
Min length: 1

Characters and Unicode

Total characters: 142940
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 27
3rd row: 23
4th row: 2
5th row: 24

Value                   Count     Frequency (%)
20                      3585      4.3%
12                      3146      3.8%
30                      3055      3.7%
10                      2971      3.6%
19                      2958      3.6%
15                      2952      3.5%
17                      2883      3.5%
13                      2839      3.4%
8                       2833      3.4%
4                       2794      3.4%
Other values (21)       53152     63.9%

Most occurring characters

ValueCountFrequency (%)
1 38104
26.7%
2 34622
24.2%
3 12250
 
8.6%
0 9611
 
6.7%
5 8235
 
5.8%
4 8225
 
5.8%
7 8185
 
5.7%
8 8057
 
5.6%
9 7878
 
5.5%
6 7773
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 142940
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 38104
26.7%
2 34622
24.2%
3 12250
 
8.6%
0 9611
 
6.7%
5 8235
 
5.8%
4 8225
 
5.8%
7 8185
 
5.7%
8 8057
 
5.6%
9 7878
 
5.5%
6 7773
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 142940
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 38104
26.7%
2 34622
24.2%
3 12250
 
8.6%
0 9611
 
6.7%
5 8235
 
5.8%
4 8225
 
5.8%
7 8185
 
5.7%
8 8057
 
5.6%
9 7878
 
5.5%
6 7773
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 142940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 38104
26.7%
2 34622
24.2%
3 12250
 
8.6%
0 9611
 
6.7%
5 8235
 
5.8%
4 8225
 
5.8%
7 8185
 
5.7%
8 8057
 
5.6%
9 7878
 
5.5%
6 7773
 
5.4%

habitat
Text

Missing 

Distinct: 14351
Distinct (%): 49.8%
Missing: 157729
Missing (%): 84.6%
Memory size: 1.4 MiB

Length

Max length: 242
Median length: 188
Mean length: 21.31243056
Min length: 3

Characters and Unicode

Total characters: 613798
Distinct characters: 96
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11617
Unique (%): 40.3%

Sample

1st row: Earth
2nd row: Edge of lake; Moist soil
3rd row: On high cliff
4th row: Primary montane forest.
5th row: Sur les arbres (on the trees)

Value                   Count     Frequency (%)
on                      15468     13.4%
in                      6301      5.5%
of                      4480      3.9%
rocks                   4154      3.6%
a                       1945      1.7%
woods                   1920      1.7%
wet                     1736      1.5%
trees                   1636      1.4%
and                     1457      1.3%
tree                    1397      1.2%
Other values (4244)     74614     64.8%

Most occurring characters

ValueCountFrequency (%)
86308
14.1%
e 49629
 
8.1%
o 47740
 
7.8%
n 47247
 
7.7%
a 38799
 
6.3%
s 38696
 
6.3%
r 36877
 
6.0%
t 27255
 
4.4%
i 24024
 
3.9%
d 23439
 
3.8%
Other values (86) 193784
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 476716
77.7%
Space Separator 86308
 
14.1%
Uppercase Letter 35456
 
5.8%
Other Punctuation 12410
 
2.0%
Dash Punctuation 868
 
0.1%
Close Punctuation 824
 
0.1%
Open Punctuation 816
 
0.1%
Decimal Number 335
 
0.1%
Math Symbol 43
 
< 0.1%
Currency Symbol 11
 
< 0.1%
Other values (3) 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 49629
10.4%
o 47740
 
10.0%
n 47247
 
9.9%
a 38799
 
8.1%
s 38696
 
8.1%
r 36877
 
7.7%
t 27255
 
5.7%
i 24024
 
5.0%
d 23439
 
4.9%
l 20507
 
4.3%
Other values (17) 122503
25.7%
Uppercase Letter
ValueCountFrequency (%)
O 14615
41.2%
I 2748
 
7.8%
S 2363
 
6.7%
B 2146
 
6.1%
R 1640
 
4.6%
A 1512
 
4.3%
C 1306
 
3.7%
W 1225
 
3.5%
M 1196
 
3.4%
D 910
 
2.6%
Other values (17) 5795
 
16.3%
Other Punctuation
ValueCountFrequency (%)
. 5273
42.5%
, 3827
30.8%
; 2610
21.0%
/ 316
 
2.5%
& 127
 
1.0%
? 91
 
0.7%
" 73
 
0.6%
' 54
 
0.4%
: 34
 
0.3%
¡ 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 67
20.0%
2 55
16.4%
0 55
16.4%
3 52
15.5%
4 32
9.6%
6 20
 
6.0%
5 20
 
6.0%
9 15
 
4.5%
8 13
 
3.9%
7 6
 
1.8%
Open Punctuation
ValueCountFrequency (%)
( 796
97.5%
10
 
1.2%
[ 9
 
1.1%
{ 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
+ 38
88.4%
= 4
 
9.3%
< 1
 
2.3%
Currency Symbol
ValueCountFrequency (%)
¤ 6
54.5%
¢ 3
27.3%
£ 2
 
18.2%
Initial Punctuation
ValueCountFrequency (%)
3
50.0%
2
33.3%
1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 866
99.8%
2
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 814
98.8%
] 10
 
1.2%
Final Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Space Separator
ValueCountFrequency (%)
86308
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 512172
83.4%
Common 101626
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 49629
 
9.7%
o 47740
 
9.3%
n 47247
 
9.2%
a 38799
 
7.6%
s 38696
 
7.6%
r 36877
 
7.2%
t 27255
 
5.3%
i 24024
 
4.7%
d 23439
 
4.6%
l 20507
 
4.0%
Other values (44) 157959
30.8%
Common
ValueCountFrequency (%)
86308
84.9%
. 5273
 
5.2%
, 3827
 
3.8%
; 2610
 
2.6%
- 866
 
0.9%
) 814
 
0.8%
( 796
 
0.8%
/ 316
 
0.3%
& 127
 
0.1%
? 91
 
0.1%
Other values (32) 598
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 613755
> 99.9%
Punctuation 22
 
< 0.1%
None 20
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
86308
14.1%
e 49629
 
8.1%
o 47740
 
7.8%
n 47247
 
7.7%
a 38799
 
6.3%
s 38696
 
6.3%
r 36877
 
6.0%
t 27255
 
4.4%
i 24024
 
3.9%
d 23439
 
3.8%
Other values (72) 193741
31.6%
Punctuation
ValueCountFrequency (%)
10
45.5%
3
 
13.6%
3
 
13.6%
2
 
9.1%
2
 
9.1%
1
 
4.5%
1
 
4.5%
None
ValueCountFrequency (%)
¤ 6
30.0%
Š 3
15.0%
¡ 3
15.0%
ø 3
15.0%
¢ 3
15.0%
£ 2
 
10.0%
Modifier Letters
ValueCountFrequency (%)
ˆ 1
100.0%

higherGeography
Text

Missing 

Distinct: 3946
Distinct (%): 3.4%
Missing: 72099
Missing (%): 38.7%
Memory size: 1.4 MiB

Length

Max length: 100
Median length: 90
Mean length: 51.15013545
Min length: 4

Characters and Unicode

Total characters: 5853110
Distinct characters: 68
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1598
Unique (%): 1.4%

Sample

1st row: North America; USA; Connecticut; New London County; Salem
2nd row: North America; USA; Connecticut; New Haven County; New Haven
3rd row: North America; Canada; British Columbia
4th row: North America; USA; Connecticut; Litchfield County; Washington
5th row: North America; USA; Connecticut; Hartford County; Southington

Value                   Count     Frequency (%)
north                   111995    14.5%
america                 109184    14.1%
usa                     99054     12.8%
county                  86823     11.2%
connecticut             62098     8.0%
new                     41413     5.3%
haven                   29950     3.9%
hartford                12411     1.6%
litchfield              10261     1.3%
fairfield               7167      0.9%
Other values (2947)     204171    26.4%
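The higherGeography samples read as a semicolon-delimited hierarchy ordered from broadest to most specific. A minimal sketch of splitting such a value into levels follows. Treating the first level as the continent and the second as the country is an assumption based on the sample rows, and the function name `split_geography` is hypothetical.

```python
def split_geography(value):
    """Split a ';'-delimited hierarchy into its levels, broadest first."""
    return [level.strip() for level in value.split(";") if level.strip()]


levels = split_geography("North America; USA; Connecticut; New London County; Salem")
print(levels)
```

Counting whole levels instead of whitespace-separated words would keep names such as "New Haven County" together, whereas the word table above counts "new", "haven" and "county" separately.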

Most occurring characters

ValueCountFrequency (%)
660097
 
11.3%
t 419182
 
7.2%
o 390984
 
6.7%
; 389920
 
6.7%
n 368644
 
6.3%
e 355954
 
6.1%
r 341889
 
5.8%
a 317910
 
5.4%
i 314775
 
5.4%
c 274503
 
4.7%
Other values (58) 2019252
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3828543
65.4%
Uppercase Letter 972588
 
16.6%
Space Separator 660097
 
11.3%
Other Punctuation 390992
 
6.7%
Dash Punctuation 882
 
< 0.1%
Open Punctuation 4
 
< 0.1%
Close Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 419182
10.9%
o 390984
10.2%
n 368644
9.6%
e 355954
9.3%
r 341889
8.9%
a 317910
8.3%
i 314775
8.2%
c 274503
 
7.2%
u 191407
 
5.0%
h 158690
 
4.1%
Other values (22) 694605
18.1%
Uppercase Letter
ValueCountFrequency (%)
A 217147
22.3%
C 174483
17.9%
N 158609
16.3%
S 122261
12.6%
U 100685
10.4%
H 51363
 
5.3%
M 24299
 
2.5%
L 22479
 
2.3%
W 14997
 
1.5%
F 14090
 
1.4%
Other values (16) 72175
 
7.4%
Other Punctuation
ValueCountFrequency (%)
; 389920
99.7%
' 560
 
0.1%
. 316
 
0.1%
& 196
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 2
50.0%
( 2
50.0%
Close Punctuation
ValueCountFrequency (%)
] 2
50.0%
) 2
50.0%
Space Separator
ValueCountFrequency (%)
660097
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 882
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4801131
82.0%
Common 1051979
 
18.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 419182
 
8.7%
o 390984
 
8.1%
n 368644
 
7.7%
e 355954
 
7.4%
r 341889
 
7.1%
a 317910
 
6.6%
i 314775
 
6.6%
c 274503
 
5.7%
A 217147
 
4.5%
u 191407
 
4.0%
Other values (48) 1608736
33.5%
Common
ValueCountFrequency (%)
660097
62.7%
; 389920
37.1%
- 882
 
0.1%
' 560
 
0.1%
. 316
 
< 0.1%
& 196
 
< 0.1%
[ 2
 
< 0.1%
] 2
 
< 0.1%
( 2
 
< 0.1%
) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5852731
> 99.9%
None 379
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
660097
 
11.3%
t 419182
 
7.2%
o 390984
 
6.7%
; 389920
 
6.7%
n 368644
 
6.3%
e 355954
 
6.1%
r 341889
 
5.8%
a 317910
 
5.4%
i 314775
 
5.4%
c 274503
 
4.7%
Other values (52) 2018873
34.5%
None
ValueCountFrequency (%)
á 110
29.0%
í 98
25.9%
ü 97
25.6%
é 36
 
9.5%
ó 36
 
9.5%
ç 2
 
0.5%

continent
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 73523
Missing (%): 39.4%
Memory size: 1.4 MiB

Length

Max length: 13
Median length: 13
Mean length: 12.75156186
Min length: 4

Characters and Unicode

Total characters: 1441003
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: North America
2nd row: North America
3rd row: North America
4th row: North America
5th row: North America

Value                   Count     Frequency (%)
america                 109184    49.1%
north                   108434    48.8%
europe                  1833      0.8%
asia                    1008      0.5%
south                   750       0.3%
oceania                 679       0.3%
africa                  298       0.1%
antarctica              4         < 0.1%

Most occurring characters

ValueCountFrequency (%)
r 219753
15.3%
a 111856
7.8%
e 111696
7.8%
i 111173
7.7%
o 111017
7.7%
A 110494
7.7%
c 110169
7.6%
t 109192
7.6%
h 109184
7.6%
109184
7.6%
Other values (10) 227285
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1109629
77.0%
Uppercase Letter 222190
 
15.4%
Space Separator 109184
 
7.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 219753
19.8%
a 111856
10.1%
e 111696
10.1%
i 111173
10.0%
o 111017
10.0%
c 110169
9.9%
t 109192
9.8%
h 109184
9.8%
m 109184
9.8%
u 2583
 
0.2%
Other values (4) 3822
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
A 110494
49.7%
N 108434
48.8%
E 1833
 
0.8%
S 750
 
0.3%
O 679
 
0.3%
Space Separator
ValueCountFrequency (%)
109184
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1331819
92.4%
Common 109184
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 219753
16.5%
a 111856
8.4%
e 111696
8.4%
i 111173
8.3%
o 111017
8.3%
A 110494
8.3%
c 110169
8.3%
t 109192
8.2%
h 109184
8.2%
m 109184
8.2%
Other values (9) 118101
8.9%
Common
ValueCountFrequency (%)
109184
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1441003
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 219753
15.3%
a 111856
7.8%
e 111696
7.8%
i 111173
7.7%
o 111017
7.7%
A 110494
7.7%
c 110169
7.6%
t 109192
7.6%
h 109184
7.6%
109184
7.6%
Other values (10) 227285
15.8%

waterBody
Text

Missing 

Distinct: 18
Distinct (%): 0.6%
Missing: 183495
Missing (%): 98.4%
Memory size: 1.4 MiB

Length

Max length: 33
Median length: 32
Mean length: 22.4996704
Min length: 12

Characters and Unicode

Total characters: 68264
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 0.2%

Sample

1st row: Atlantic Ocean; Caribbean Sea
2nd row: Atlantic Ocean; Sargasso Sea
3rd row: Atlantic Ocean
4th row: Atlantic Ocean; Caribbean Sea
5th row: Atlantic Ocean; Adriatic Sea

Value                   Count     Frequency (%)
ocean                   3034      30.6%
atlantic                2509      25.3%
sea                     1009      10.2%
caribbean               673       6.8%
long                    503       5.1%
island                  503       5.1%
sound                   503       5.1%
pacific                 450       4.5%
adriatic                126       1.3%
sargasso                123       1.2%
Other values (15)       478       4.8%

Most occurring characters

ValueCountFrequency (%)
a 9455
13.9%
n 8029
11.8%
6877
10.1%
c 6672
9.8%
t 5226
 
7.7%
e 5055
 
7.4%
i 4589
 
6.7%
l 3120
 
4.6%
O 3036
 
4.4%
A 2636
 
3.9%
Other values (26) 13569
19.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 49965
73.2%
Uppercase Letter 9805
 
14.4%
Space Separator 6877
 
10.1%
Other Punctuation 1617
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9455
18.9%
n 8029
16.1%
c 6672
13.4%
t 5226
10.5%
e 5055
10.1%
i 4589
9.2%
l 3120
 
6.2%
b 1347
 
2.7%
o 1339
 
2.7%
d 1289
 
2.6%
Other values (11) 3844
7.7%
Uppercase Letter
ValueCountFrequency (%)
O 3036
31.0%
A 2636
26.9%
S 1637
16.7%
C 675
 
6.9%
I 577
 
5.9%
L 503
 
5.1%
P 451
 
4.6%
M 176
 
1.8%
G 105
 
1.1%
R 6
 
0.1%
Other values (3) 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
6877
100.0%
Other Punctuation
ValueCountFrequency (%)
; 1617
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 59770
87.6%
Common 8494
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9455
15.8%
n 8029
13.4%
c 6672
11.2%
t 5226
8.7%
e 5055
8.5%
i 4589
7.7%
l 3120
 
5.2%
O 3036
 
5.1%
A 2636
 
4.4%
S 1637
 
2.7%
Other values (24) 10315
17.3%
Common
ValueCountFrequency (%)
6877
81.0%
; 1617
 
19.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 68264
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9455
13.9%
n 8029
11.8%
6877
10.1%
c 6672
9.8%
t 5226
 
7.7%
e 5055
 
7.4%
i 4589
 
6.7%
l 3120
 
4.6%
O 3036
 
4.4%
A 2636
 
3.9%
Other values (26) 13569
19.9%

country
Text

Missing 

Distinct 102
Distinct (%) 0.1%
Missing 72769
Missing (%) 39.0%
Memory size 1.4 MiB

Length

Max length 44
Median length 3
Mean length 3.460979255
Min length 3

Characters and Unicode

Total characters 393721
Distinct characters 51
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
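
The per-category character counts reported for each variable can be approximated with Python's standard library. The following is a minimal sketch, not necessarily how this report was produced: the function name `category_counts` and the `CATEGORY_NAMES` table are illustrative assumptions, and only the general category is covered, because the standard `unicodedata` module does not expose script or block lookups.

```python
import unicodedata
from collections import Counter

# Long names for a subset of Unicode general-category codes.
# This table is an illustrative assumption; unknown codes fall back to the raw code.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
    "So": "Other Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing cells
            continue
        for ch in str(value):
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows shown for the country variable.
sample = ["USA", "USA", "Canada", "USA", "USA"]
print(category_counts(sample).most_common())
```

Run over a whole column, the counts can be compared with the "Most occurring categories" table of the corresponding variable.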

Unique

Unique 19
Unique (%) < 0.1%

Sample

1st row USA
2nd row USA
3rd row Canada
4th row USA
5th row USA
ValueCountFrequency (%)
usa 99054
86.3%
canada 6369
 
5.5%
mexico 1398
 
1.2%
cuba 1377
 
1.2%
china 732
 
0.6%
united 654
 
0.6%
kingdom 654
 
0.6%
australia 497
 
0.4%
france 481
 
0.4%
bermuda 435
 
0.4%
Other values (116) 3186
 
2.8%

Most occurring characters

ValueCountFrequency (%)
A 99716
25.3%
U 99711
25.3%
S 99212
25.2%
a 27059
 
6.9%
n 10145
 
2.6%
d 8941
 
2.3%
C 8633
 
2.2%
i 5494
 
1.4%
e 4275
 
1.1%
u 3416
 
0.9%
Other values (41) 27119
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 312915
79.5%
Lowercase Letter 79725
 
20.2%
Space Separator 1077
 
0.3%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 27059
33.9%
n 10145
 
12.7%
d 8941
 
11.2%
i 5494
 
6.9%
e 4275
 
5.4%
u 3416
 
4.3%
r 2836
 
3.6%
o 2826
 
3.5%
c 2533
 
3.2%
m 1872
 
2.3%
Other values (16) 10328
 
13.0%
Uppercase Letter
ValueCountFrequency (%)
A 99716
31.9%
U 99711
31.9%
S 99212
31.7%
C 8633
 
2.8%
M 1620
 
0.5%
B 815
 
0.3%
K 671
 
0.2%
F 597
 
0.2%
P 317
 
0.1%
E 309
 
0.1%
Other values (12) 1314
 
0.4%
Space Separator
ValueCountFrequency (%)
1077
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 2
100.0%
Close Punctuation
ValueCountFrequency (%)
] 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 392640
99.7%
Common 1081
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 99716
25.4%
U 99711
25.4%
S 99212
25.3%
a 27059
 
6.9%
n 10145
 
2.6%
d 8941
 
2.3%
C 8633
 
2.2%
i 5494
 
1.4%
e 4275
 
1.1%
u 3416
 
0.9%
Other values (38) 26038
 
6.6%
Common
ValueCountFrequency (%)
1077
99.6%
[ 2
 
0.2%
] 2
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 393721
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 99716
25.3%
U 99711
25.3%
S 99212
25.2%
a 27059
 
6.9%
n 10145
 
2.6%
d 8941
 
2.3%
C 8633
 
2.2%
i 5494
 
1.4%
e 4275
 
1.1%
u 3416
 
0.9%
Other values (41) 27119
 
6.9%

stateProvince
Text

Missing 

Distinct 228
Distinct (%) 0.2%
Missing 78016
Missing (%) 41.8%
Memory size 1.4 MiB

Length

Max length 28
Median length 11
Mean length 10.38558514
Min length 4

Characters and Unicode

Total characters 1126971
Distinct characters 58
Distinct categories 4
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 48
Unique (%) < 0.1%

Sample

1st row Connecticut
2nd row Connecticut
3rd row British Columbia
4th row Connecticut
5th row Connecticut
ValueCountFrequency (%)
connecticut 62098
50.1%
new 5448
 
4.4%
california 3651
 
2.9%
michigan 3126
 
2.5%
florida 2732
 
2.2%
hampshire 2664
 
2.1%
massachusetts 2337
 
1.9%
maine 2073
 
1.7%
columbia 2034
 
1.6%
british 1905
 
1.5%
Other values (255) 35967
29.0%

Most occurring characters

ValueCountFrequency (%)
n 153675
13.6%
t 143885
12.8%
c 135676
12.0%
i 108030
9.6%
o 93850
8.3%
e 90883
8.1%
u 72414
6.4%
C 69792
 
6.2%
a 54867
 
4.9%
r 26005
 
2.3%
Other values (48) 177894
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 987704
87.6%
Uppercase Letter 123543
 
11.0%
Space Separator 15522
 
1.4%
Dash Punctuation 202
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 153675
15.6%
t 143885
14.6%
c 135676
13.7%
i 108030
10.9%
o 93850
9.5%
e 90883
9.2%
u 72414
7.3%
a 54867
 
5.6%
r 26005
 
2.6%
s 23900
 
2.4%
Other values (21) 84519
8.6%
Uppercase Letter
ValueCountFrequency (%)
C 69792
56.5%
M 9865
 
8.0%
N 8790
 
7.1%
H 4662
 
3.8%
S 3193
 
2.6%
F 2734
 
2.2%
W 2621
 
2.1%
V 2372
 
1.9%
B 2286
 
1.9%
P 2093
 
1.7%
Other values (15) 15135
 
12.3%
Space Separator
ValueCountFrequency (%)
15522
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 202
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1111247
98.6%
Common 15724
 
1.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 153675
13.8%
t 143885
12.9%
c 135676
12.2%
i 108030
9.7%
o 93850
8.4%
e 90883
8.2%
u 72414
6.5%
C 69792
6.3%
a 54867
 
4.9%
r 26005
 
2.3%
Other values (46) 162170
14.6%
Common
ValueCountFrequency (%)
15522
98.7%
- 202
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1126602
> 99.9%
None 369
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 153675
13.6%
t 143885
12.8%
c 135676
12.0%
i 108030
9.6%
o 93850
8.3%
e 90883
8.1%
u 72414
6.4%
C 69792
 
6.2%
a 54867
 
4.9%
r 26005
 
2.3%
Other values (43) 177525
15.8%
None
ValueCountFrequency (%)
á 107
29.0%
í 98
26.6%
ü 97
26.3%
ó 34
 
9.2%
é 33
 
8.9%

county
Text

Missing 

Distinct 881
Distinct (%) 1.0%
Missing 98586
Missing (%) 52.9%
Memory size 1.4 MiB

Length

Max length 32
Median length 31
Mean length 15.47382964
Min length 4

Characters and Unicode

Total characters 1360815
Distinct characters 59
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 234
Unique (%) 0.3%

Sample

1st row New London County
2nd row New Haven County
3rd row Litchfield County
4th row Hartford County
5th row Litchfield County
ValueCountFrequency (%)
county 86823
42.1%
new 27711
 
13.4%
haven 21492
 
10.4%
hartford 10602
 
5.1%
litchfield 8892
 
4.3%
fairfield 6414
 
3.1%
london 6205
 
3.0%
middlesex 4458
 
2.2%
windham 2098
 
1.0%
tolland 1927
 
0.9%
Other values (919) 29493
 
14.3%

Most occurring characters

ValueCountFrequency (%)
n 142574
 
10.5%
o 128105
 
9.4%
118172
 
8.7%
t 116308
 
8.5%
u 93310
 
6.9%
C 90545
 
6.7%
e 90254
 
6.6%
y 88473
 
6.5%
a 63460
 
4.7%
d 48213
 
3.5%
Other values (49) 381401
28.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1035147
76.1%
Uppercase Letter 206666
 
15.2%
Space Separator 118172
 
8.7%
Dash Punctuation 444
 
< 0.1%
Other Punctuation 384
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 142574
13.8%
o 128105
12.4%
t 116308
11.2%
u 93310
9.0%
e 90254
8.7%
y 88473
8.5%
a 63460
 
6.1%
d 48213
 
4.7%
i 47877
 
4.6%
r 39508
 
3.8%
Other values (17) 177065
17.1%
Uppercase Letter
ValueCountFrequency (%)
C 90545
43.8%
H 33277
 
16.1%
N 28182
 
13.6%
L 16295
 
7.9%
M 7866
 
3.8%
F 7523
 
3.6%
S 4016
 
1.9%
W 3238
 
1.6%
T 2375
 
1.1%
B 2345
 
1.1%
Other values (16) 11004
 
5.3%
Other Punctuation
ValueCountFrequency (%)
. 223
58.1%
' 161
41.9%
Space Separator
ValueCountFrequency (%)
118172
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 444
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1241813
91.3%
Common 119002
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 142574
11.5%
o 128105
 
10.3%
t 116308
 
9.4%
u 93310
 
7.5%
C 90545
 
7.3%
e 90254
 
7.3%
y 88473
 
7.1%
a 63460
 
5.1%
d 48213
 
3.9%
i 47877
 
3.9%
Other values (43) 332694
26.8%
Common
ValueCountFrequency (%)
118172
99.3%
- 444
 
0.4%
. 223
 
0.2%
' 161
 
0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1360813
> 99.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 142574
 
10.5%
o 128105
 
9.4%
118172
 
8.7%
t 116308
 
8.5%
u 93310
 
6.9%
C 90545
 
6.7%
e 90254
 
6.6%
y 88473
 
6.5%
a 63460
 
4.7%
d 48213
 
3.5%
Other values (48) 381399
28.0%
None
ValueCountFrequency (%)
ó 2
100.0%

municipality
Text

Missing 

Distinct 2118
Distinct (%) 2.8%
Missing 110052
Missing (%) 59.0%
Memory size 1.4 MiB

Length

Max length 39
Median length 30
Mean length 8.966486656
Min length 3

Characters and Unicode

Total characters 685730
Distinct characters 62
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 841
Unique (%) 1.1%

Sample

1st row Salem
2nd row New Haven
3rd row Washington
4th row Southington
5th row Cornwall
ValueCountFrequency (%)
haven 8458
 
8.7%
new 8105
 
8.3%
southington 2857
 
2.9%
north 2691
 
2.8%
east 2584
 
2.7%
guilford 2301
 
2.4%
salisbury 1956
 
2.0%
lyme 1877
 
1.9%
branford 1824
 
1.9%
hartford 1809
 
1.9%
Other values (2013) 62977
64.6%

Most occurring characters

ValueCountFrequency (%)
o 53847
 
7.9%
e 53791
 
7.8%
n 53538
 
7.8%
r 52706
 
7.7%
a 51213
 
7.5%
t 42856
 
6.2%
i 37612
 
5.5%
l 34531
 
5.0%
d 28780
 
4.2%
20962
 
3.1%
Other values (52) 255894
37.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 566373
82.6%
Uppercase Letter 97469
 
14.2%
Space Separator 20962
 
3.1%
Other Punctuation 688
 
0.1%
Dash Punctuation 236
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 53847
 
9.5%
e 53791
 
9.5%
n 53538
 
9.5%
r 52706
 
9.3%
a 51213
 
9.0%
t 42856
 
7.6%
i 37612
 
6.6%
l 34531
 
6.1%
d 28780
 
5.1%
s 19429
 
3.4%
Other values (19) 138070
24.4%
Uppercase Letter
ValueCountFrequency (%)
S 13453
13.8%
H 13413
13.8%
N 12984
13.3%
W 9137
9.4%
B 6850
 
7.0%
G 5994
 
6.1%
C 4838
 
5.0%
M 4772
 
4.9%
L 4537
 
4.7%
E 3457
 
3.5%
Other values (16) 18034
18.5%
Other Punctuation
ValueCountFrequency (%)
' 399
58.0%
& 196
28.5%
. 93
 
13.5%
Space Separator
ValueCountFrequency (%)
20962
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 236
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 663842
96.8%
Common 21888
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 53847
 
8.1%
e 53791
 
8.1%
n 53538
 
8.1%
r 52706
 
7.9%
a 51213
 
7.7%
t 42856
 
6.5%
i 37612
 
5.7%
l 34531
 
5.2%
d 28780
 
4.3%
s 19429
 
2.9%
Other values (45) 235539
35.5%
Common
ValueCountFrequency (%)
20962
95.8%
' 399
 
1.8%
- 236
 
1.1%
& 196
 
0.9%
. 93
 
0.4%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 685722
> 99.9%
None 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 53847
 
7.9%
e 53791
 
7.8%
n 53538
 
7.8%
r 52706
 
7.7%
a 51213
 
7.5%
t 42856
 
6.2%
i 37612
 
5.5%
l 34531
 
5.0%
d 28780
 
4.2%
20962
 
3.1%
Other values (49) 255886
37.3%
None
ValueCountFrequency (%)
á 3
37.5%
é 3
37.5%
ç 2
25.0%

locality
Text

Missing 

Distinct 21415
Distinct (%) 35.0%
Missing 125307
Missing (%) 67.2%
Memory size 1.4 MiB

Length

Max length 351
Median length 190
Mean length 26.85875992
Min length 3

Characters and Unicode

Total characters 1644347
Distinct characters 94
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 14165
Unique (%) 23.1%

Sample

1st row near Scotch Creek, Shushwap Lake
2nd row South Shuttle Street
3rd row Calumet Island, Timbalier Bay
4th row Rio Blanco
5th row Oak Hill
ValueCountFrequency (%)
of 13616
 
5.1%
near 6444
 
2.4%
island 5988
 
2.3%
river 4368
 
1.6%
lake 4069
 
1.5%
and 3691
 
1.4%
road 3087
 
1.2%
yale 3008
 
1.1%
mountains 2923
 
1.1%
west 2905
 
1.1%
Other values (11704) 214841
81.1%

Most occurring characters

ValueCountFrequency (%)
204227
 
12.4%
a 141357
 
8.6%
e 131463
 
8.0%
o 116914
 
7.1%
n 109604
 
6.7%
r 89713
 
5.5%
t 83094
 
5.1%
i 75103
 
4.6%
l 66840
 
4.1%
s 66454
 
4.0%
Other values (84) 559578
34.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1197147
72.8%
Space Separator 204227
 
12.4%
Uppercase Letter 186670
 
11.4%
Other Punctuation 33882
 
2.1%
Decimal Number 13304
 
0.8%
Close Punctuation 3613
 
0.2%
Open Punctuation 3582
 
0.2%
Dash Punctuation 1450
 
0.1%
Other Symbol 423
 
< 0.1%
Math Symbol 49
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 141357
11.8%
e 131463
11.0%
o 116914
9.8%
n 109604
9.2%
r 89713
 
7.5%
t 83094
 
6.9%
i 75103
 
6.3%
l 66840
 
5.6%
s 66454
 
5.6%
d 50253
 
4.2%
Other values (24) 266352
22.2%
Uppercase Letter
ValueCountFrequency (%)
M 17688
 
9.5%
S 17239
 
9.2%
R 16853
 
9.0%
C 14862
 
8.0%
P 14723
 
7.9%
B 14281
 
7.7%
L 12462
 
6.7%
H 8806
 
4.7%
N 8375
 
4.5%
I 8098
 
4.3%
Other values (17) 53283
28.5%
Other Punctuation
ValueCountFrequency (%)
, 22303
65.8%
. 6163
 
18.2%
' 3859
 
11.4%
/ 345
 
1.0%
" 313
 
0.9%
; 267
 
0.8%
: 241
 
0.7%
? 164
 
0.5%
& 134
 
0.4%
# 92
 
0.3%
Decimal Number
ValueCountFrequency (%)
1 2497
18.8%
2 1786
13.4%
0 1562
11.7%
3 1451
10.9%
5 1340
10.1%
4 1323
9.9%
7 900
 
6.8%
9 872
 
6.6%
6 818
 
6.1%
8 755
 
5.7%
Close Punctuation
ValueCountFrequency (%)
] 3215
89.0%
) 397
 
11.0%
} 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 3212
89.7%
( 369
 
10.3%
{ 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 42
85.7%
+ 6
 
12.2%
> 1
 
2.0%
Space Separator
ValueCountFrequency (%)
204227
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1450
100.0%
Other Symbol
ValueCountFrequency (%)
° 423
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1383817
84.2%
Common 260530
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 141357
 
10.2%
e 131463
 
9.5%
o 116914
 
8.4%
n 109604
 
7.9%
r 89713
 
6.5%
t 83094
 
6.0%
i 75103
 
5.4%
l 66840
 
4.8%
s 66454
 
4.8%
d 50253
 
3.6%
Other values (51) 453022
32.7%
Common
ValueCountFrequency (%)
204227
78.4%
, 22303
 
8.6%
. 6163
 
2.4%
' 3859
 
1.5%
] 3215
 
1.2%
[ 3212
 
1.2%
1 2497
 
1.0%
2 1786
 
0.7%
0 1562
 
0.6%
3 1451
 
0.6%
Other values (23) 10255
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1643879
> 99.9%
None 468
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
204227
 
12.4%
a 141357
 
8.6%
e 131463
 
8.0%
o 116914
 
7.1%
n 109604
 
6.7%
r 89713
 
5.5%
t 83094
 
5.1%
i 75103
 
4.6%
l 66840
 
4.1%
s 66454
 
4.0%
Other values (74) 559110
34.0%
None
ValueCountFrequency (%)
° 423
90.4%
é 14
 
3.0%
á 9
 
1.9%
à 6
 
1.3%
í 6
 
1.3%
Î 4
 
0.9%
ñ 2
 
0.4%
ú 2
 
0.4%
ã 1
 
0.2%
ä 1
 
0.2%
Distinct 684
Distinct (%) 9.0%
Missing 178933
Missing (%) 95.9%
Memory size 1.4 MiB

Length

Max length 4
Median length 3
Mean length 3.396656135
Min length 1

Characters and Unicode

Total characters 25801
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 239
Unique (%) 3.1%

Sample

1st row 564
2nd row 1450
3rd row 1012
4th row 137
5th row 1463
ValueCountFrequency (%)
1524 279
 
3.7%
305 237
 
3.1%
1219 204
 
2.7%
1829 195
 
2.6%
914 180
 
2.4%
2743 175
 
2.3%
366 170
 
2.2%
610 167
 
2.2%
762 151
 
2.0%
244 150
 
2.0%
Other values (674) 5688
74.9%

Most occurring characters

ValueCountFrequency (%)
1 4443
17.2%
2 3804
14.7%
0 3792
14.7%
3 2534
9.8%
4 2370
9.2%
5 2298
8.9%
6 1897
7.4%
7 1658
 
6.4%
8 1544
 
6.0%
9 1461
 
5.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25801
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 4443
17.2%
2 3804
14.7%
0 3792
14.7%
3 2534
9.8%
4 2370
9.2%
5 2298
8.9%
6 1897
7.4%
7 1658
 
6.4%
8 1544
 
6.0%
9 1461
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
Common 25801
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 4443
17.2%
2 3804
14.7%
0 3792
14.7%
3 2534
9.8%
4 2370
9.2%
5 2298
8.9%
6 1897
7.4%
7 1658
 
6.4%
8 1544
 
6.0%
9 1461
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25801
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 4443
17.2%
2 3804
14.7%
0 3792
14.7%
3 2534
9.8%
4 2370
9.2%
5 2298
8.9%
6 1897
7.4%
7 1658
 
6.4%
8 1544
 
6.0%
9 1461
 
5.7%
Distinct 102
Distinct (%) 13.9%
Missing 185793
Missing (%) 99.6%
Memory size 1.4 MiB

Length

Max length 4
Median length 4
Mean length 3.585597826
Min length 1

Characters and Unicode

Total characters 2639
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 41
Unique (%) 5.6%

Sample

1st row 1550
2nd row 1200
3rd row 800
4th row 800
5th row 1035
ValueCountFrequency (%)
1155 62
 
8.4%
800 52
 
7.1%
1200 39
 
5.3%
3048 38
 
5.2%
900 38
 
5.2%
600 31
 
4.2%
300 30
 
4.1%
2438 28
 
3.8%
1829 27
 
3.7%
1380 26
 
3.5%
Other values (92) 365
49.6%

Most occurring characters

ValueCountFrequency (%)
0 744
28.2%
1 457
17.3%
2 294
 
11.1%
5 269
 
10.2%
8 226
 
8.6%
3 217
 
8.2%
4 153
 
5.8%
9 122
 
4.6%
6 83
 
3.1%
7 74
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2639
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 744
28.2%
1 457
17.3%
2 294
 
11.1%
5 269
 
10.2%
8 226
 
8.6%
3 217
 
8.2%
4 153
 
5.8%
9 122
 
4.6%
6 83
 
3.1%
7 74
 
2.8%

Most occurring scripts

ValueCountFrequency (%)
Common 2639
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 744
28.2%
1 457
17.3%
2 294
 
11.1%
5 269
 
10.2%
8 226
 
8.6%
3 217
 
8.2%
4 153
 
5.8%
9 122
 
4.6%
6 83
 
3.1%
7 74
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2639
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 744
28.2%
1 457
17.3%
2 294
 
11.1%
5 269
 
10.2%
8 226
 
8.6%
3 217
 
8.2%
4 153
 
5.8%
9 122
 
4.6%
6 83
 
3.1%
7 74
 
2.8%

verbatimElevation
Text

Missing 

Distinct 884
Distinct (%) 11.6%
Missing 178933
Missing (%) 95.9%
Memory size 1.4 MiB

Length

Max length 14
Median length 13
Mean length 5.866508689
Min length 3

Characters and Unicode

Total characters 44562
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 369
Unique (%) 4.9%

Sample

1st row 564 m
2nd row 1450-1550 m
3rd row 1012 m
4th row 137 m
5th row 1463 m
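
The sample rows mix single values ("564 m") with ranges ("1450-1550 m"), and the character tables below show a small number of "f" and "t" characters alongside "m". The following is a hedged sketch of how such strings could be split into a numeric lower and upper bound in metres. The function name `parse_elevation`, the regular expression, and the assumption that the "f" and "t" characters form a trailing "ft" unit are all illustrative and are not confirmed by the report.

```python
import re

FEET_TO_METERS = 0.3048  # exact definition of the international foot

# Matches "564 m", "1450-1550 m" and, as an assumption, "1000 ft".
_ELEVATION = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*(?:-\s*(\d+(?:\.\d+)?))?\s*(m|ft)\s*$"
)


def parse_elevation(text):
    """Return (min_m, max_m) in metres, or None if the text does not match."""
    match = _ELEVATION.match(text)
    if match is None:
        return None
    low = float(match.group(1))
    # A single value is treated as a zero-width range.
    high = float(match.group(2)) if match.group(2) else low
    factor = FEET_TO_METERS if match.group(3) == "ft" else 1.0
    return (low * factor, high * factor)


for raw in ["564 m", "1450-1550 m", "1012 m"]:
    print(raw, "->", parse_elevation(raw))
```

Values that do not match the pattern return `None` rather than a guess, so they can be reviewed by hand.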
ValueCountFrequency (%)
m 7482
49.2%
1524 267
 
1.8%
305 236
 
1.6%
1219 190
 
1.3%
1829 179
 
1.2%
366 170
 
1.1%
914 167
 
1.1%
610 162
 
1.1%
2743 153
 
1.0%
244 150
 
1.0%
Other values (875) 6036
39.7%

Most occurring characters

ValueCountFrequency (%)
7596
17.0%
m 7482
16.8%
0 5082
11.4%
1 4893
11.0%
2 3978
8.9%
3 2614
 
5.9%
5 2551
 
5.7%
4 2434
 
5.5%
6 1983
 
4.4%
8 1718
 
3.9%
Other values (5) 4231
9.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 28520
64.0%
Lowercase Letter 7710
 
17.3%
Space Separator 7596
 
17.0%
Dash Punctuation 736
 
1.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5082
17.8%
1 4893
17.2%
2 3978
13.9%
3 2614
9.2%
5 2551
8.9%
4 2434
8.5%
6 1983
 
7.0%
8 1718
 
6.0%
7 1718
 
6.0%
9 1549
 
5.4%
Lowercase Letter
ValueCountFrequency (%)
m 7482
97.0%
f 114
 
1.5%
t 114
 
1.5%
Space Separator
ValueCountFrequency (%)
7596
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 736
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 36852
82.7%
Latin 7710
 
17.3%

Most frequent character per script

Common
ValueCountFrequency (%)
7596
20.6%
0 5082
13.8%
1 4893
13.3%
2 3978
10.8%
3 2614
 
7.1%
5 2551
 
6.9%
4 2434
 
6.6%
6 1983
 
5.4%
8 1718
 
4.7%
7 1718
 
4.7%
Other values (2) 2285
 
6.2%
Latin
ValueCountFrequency (%)
m 7482
97.0%
f 114
 
1.5%
t 114
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 44562
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7596
17.0%
m 7482
16.8%
0 5082
11.4%
1 4893
11.0%
2 3978
8.9%
3 2614
 
5.9%
5 2551
 
5.7%
4 2434
 
5.5%
6 1983
 
4.4%
8 1718
 
3.9%
Other values (5) 4231
9.5%

decimalLatitude
Text

Missing 

Distinct 8330
Distinct (%) 8.0%
Missing 82100
Missing (%) 44.0%
Memory size 1.4 MiB

Length

Max length 12
Median length 11
Mean length 7.736040755
Min length 2

Characters and Unicode

Total characters 807867
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4030
Unique (%) 3.9%

Sample

1st row 41.4854
2nd row 41.4070
3rd row 51
4th row 41.6523
5th row 41.6050
ValueCountFrequency (%)
41.4070 1988
 
1.9%
41.305111 1951
 
1.9%
41.3114 1870
 
1.8%
41.6050 1661
 
1.6%
41.5583 1312
 
1.3%
41.6049 1164
 
1.1%
41.986 1069
 
1.0%
46.166667 1017
 
1.0%
41.6153 994
 
1.0%
41.7413 947
 
0.9%
Other values (8317) 90456
86.6%

Most occurring characters

ValueCountFrequency (%)
4 139581
17.3%
1 120894
15.0%
. 104044
12.9%
3 72292
8.9%
6 64333
8.0%
9 56147
7.0%
7 53782
 
6.7%
5 53269
 
6.6%
2 50955
 
6.3%
0 46101
 
5.7%
Other values (2) 46469
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 703018
87.0%
Other Punctuation 104044
 
12.9%
Dash Punctuation 805
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 139581
19.9%
1 120894
17.2%
3 72292
10.3%
6 64333
9.2%
9 56147
8.0%
7 53782
 
7.7%
5 53269
 
7.6%
2 50955
 
7.2%
0 46101
 
6.6%
8 45664
 
6.5%
Other Punctuation
ValueCountFrequency (%)
. 104044
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 805
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 807867
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 139581
17.3%
1 120894
15.0%
. 104044
12.9%
3 72292
8.9%
6 64333
8.0%
9 56147
7.0%
7 53782
 
6.7%
5 53269
 
6.6%
2 50955
 
6.3%
0 46101
 
5.7%
Other values (2) 46469
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 807867
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 139581
17.3%
1 120894
15.0%
. 104044
12.9%
3 72292
8.9%
6 64333
8.0%
9 56147
7.0%
7 53782
 
6.7%
5 53269
 
6.6%
2 50955
 
6.3%
0 46101
 
5.7%
Other values (2) 46469
 
5.8%

decimalLongitude
Text

Missing 

Distinct 8319
Distinct (%) 8.0%
Missing 82100
Missing (%) 44.0%
Memory size 1.4 MiB

Length

Max length 12
Median length 11
Mean length 8.700179069
Min length 2

Characters and Unicode

Total characters 908551
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4014
Unique (%) 3.8%

Sample

1st row -72.2664
2nd row -72.9316
3rd row -119
4th row -73.3145
5th row -72.88
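
Both decimalLatitude and decimalLongitude are profiled as Text, so their statistics describe characters rather than numbers. The following is a minimal sketch of converting such strings to floats and rejecting values outside the valid coordinate ranges before any numeric analysis. The function names `parse_coordinate` and `parse_lat_lon` are illustrative assumptions, not part of the report or of any library.

```python
def parse_coordinate(text, limit):
    """Convert text to a float within [-limit, limit], or return None."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None  # missing or non-numeric cell
    # NaN and infinities fail this range check and are rejected as well.
    return value if -limit <= value <= limit else None


def parse_lat_lon(lat_text, lon_text):
    """Return (latitude, longitude) as floats, or None if either is invalid."""
    lat = parse_coordinate(lat_text, 90.0)
    lon = parse_coordinate(lon_text, 180.0)
    if lat is None or lon is None:
        return None
    return (lat, lon)


# Sample rows from the two variables, paired by position.
pairs = [("41.4854", "-72.2664"), ("41.4070", "-72.9316"), ("51", "-119")]
for lat_text, lon_text in pairs:
    print(parse_lat_lon(lat_text, lon_text))
```

A pair is kept only when both halves parse, which avoids plotting a record with one valid and one missing coordinate.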
ValueCountFrequency (%)
72.88 2825
 
2.7%
72.9316 1988
 
1.9%
72.920823 1951
 
1.9%
72.9247 1870
 
1.8%
73.1931 1368
 
1.3%
73.036 1211
 
1.2%
72.8575 1086
 
1.0%
73.4257 1069
 
1.0%
60.75 1048
 
1.0%
72.4831 902
 
0.9%
Other values (8306) 89111
85.3%

Most occurring characters

ValueCountFrequency (%)
7 129712
14.3%
. 103986
11.4%
- 102523
11.3%
2 99687
11.0%
3 85457
9.4%
1 73639
8.1%
8 64186
7.1%
6 54809
6.0%
9 52723
5.8%
5 48537
 
5.3%
Other values (2) 93292
10.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 702042
77.3%
Other Punctuation 103986
 
11.4%
Dash Punctuation 102523
 
11.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 129712
18.5%
2 99687
14.2%
3 85457
12.2%
1 73639
10.5%
8 64186
9.1%
6 54809
7.8%
9 52723
7.5%
5 48537
 
6.9%
4 47479
 
6.8%
0 45813
 
6.5%
Other Punctuation
ValueCountFrequency (%)
. 103986
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 102523
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 908551
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 129712
14.3%
. 103986
11.4%
- 102523
11.3%
2 99687
11.0%
3 85457
9.4%
1 73639
8.1%
8 64186
7.1%
6 54809
6.0%
9 52723
5.8%
5 48537
 
5.3%
Other values (2) 93292
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 908551
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 129712
14.3%
. 103986
11.4%
- 102523
11.3%
2 99687
11.0%
3 85457
9.4%
1 73639
8.1%
8 64186
7.1%
6 54809
6.0%
9 52723
5.8%
5 48537
 
5.3%
Other values (2) 93292
10.3%

geodeticDatum
Text

Missing 

Distinct 3
Distinct (%) < 0.1%
Missing 82283
Missing (%) 44.1%
Memory size 1.4 MiB

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 521230
Distinct characters 11
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row WGS84
2nd row WGS84
3rd row WGS84
4th row WGS84
5th row WGS84
ValueCountFrequency (%)
wgs84 103957
99.7%
nad27 286
 
0.3%
nad83 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
8 103960
19.9%
W 103957
19.9%
G 103957
19.9%
S 103957
19.9%
4 103957
19.9%
N 289
 
0.1%
A 289
 
0.1%
D 289
 
0.1%
2 286
 
0.1%
7 286
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 312738
60.0%
Decimal Number 208492
40.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
W 103957
33.2%
G 103957
33.2%
S 103957
33.2%
N 289
 
0.1%
A 289
 
0.1%
D 289
 
0.1%
Decimal Number
ValueCountFrequency (%)
8 103960
49.9%
4 103957
49.9%
2 286
 
0.1%
7 286
 
0.1%
3 3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 312738
60.0%
Common 208492
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 103957
33.2%
G 103957
33.2%
S 103957
33.2%
N 289
 
0.1%
A 289
 
0.1%
D 289
 
0.1%
Common
ValueCountFrequency (%)
8 103960
49.9%
4 103957
49.9%
2 286
 
0.1%
7 286
 
0.1%
3 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 521230
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 103960
19.9%
W 103957
19.9%
G 103957
19.9%
S 103957
19.9%
4 103957
19.9%
N 289
 
0.1%
A 289
 
0.1%
D 289
 
0.1%
2 286
 
0.1%
7 286
 
0.1%
Distinct 5431
Distinct (%) 5.2%
Missing 82137
Missing (%) 44.0%
Memory size 1.4 MiB

Length

Max length 7
Median length 4
Mean length 4.20342555
Min length 1

Characters and Unicode

Total characters 438804
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2088
Unique (%) 2.0%

Sample

1st row 7093
2nd row 7710
3rd row 1189
4th row 7762
5th row 7725
ValueCountFrequency (%)
7725 2825
 
2.7%
1851 2328
 
2.2%
7710 1992
 
1.9%
6384 1951
 
1.9%
7484 1870
 
1.8%
9878 1817
 
1.7%
5062 1804
 
1.7%
11151 1368
 
1.3%
6630 1312
 
1.3%
7184 1083
 
1.0%
Other values (5421) 86042
82.4%

Most occurring characters

ValueCountFrequency (%)
1 58864
13.4%
7 50281
11.5%
0 48395
11.0%
5 47230
10.8%
8 43792
10.0%
6 41665
9.5%
4 40758
9.3%
3 37978
8.7%
2 35411
8.1%
9 34369
7.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 438743
> 99.9%
Other Punctuation 61
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 58864
13.4%
7 50281
11.5%
0 48395
11.0%
5 47230
10.8%
8 43792
10.0%
6 41665
9.5%
4 40758
9.3%
3 37978
8.7%
2 35411
8.1%
9 34369
7.8%
Other Punctuation
ValueCountFrequency (%)
. 61
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 438804
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 58864
13.4%
7 50281
11.5%
0 48395
11.0%
5 47230
10.8%
8 43792
10.0%
6 41665
9.5%
4 40758
9.3%
3 37978
8.7%
2 35411
8.1%
9 34369
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 438804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 58864
13.4%
7 50281
11.5%
0 48395
11.0%
5 47230
10.8%
8 43792
10.0%
6 41665
9.5%
4 40758
9.3%
3 37978
8.7%
2 35411
8.1%
9 34369
7.8%

georeferencedBy
Text

Missing 

Distinct 6
Distinct (%) 0.1%
Missing 182211
Missing (%) 97.7%
Memory size 1.4 MiB

Length

Max length 18
Median length 16
Mean length 16.97522001
Min length 13

Characters and Unicode

Total characters 73299
Distinct characters 31
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 0.1%

Sample

1st row Angus J. Mossman
2nd row Angus J. Mossman
3rd row Angus J. Mossman
4th row Angus J. Mossman
5th row Angus J. Mossman
ValueCountFrequency (%)
angus 2204
17.0%
j 2204
17.0%
mossman 2204
17.0%
patrick 2110
16.3%
w 2110
16.3%
sweeney 2110
16.3%
lynn 1
 
< 0.1%
a 1
 
< 0.1%
jones 1
 
< 0.1%
jesse 1
 
< 0.1%
Other values (6) 6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
8634
 
11.8%
s 6616
 
9.0%
n 6522
 
8.9%
e 6336
 
8.6%
a 4318
 
5.9%
. 4316
 
5.9%
J 2206
 
3.0%
A 2205
 
3.0%
o 2205
 
3.0%
g 2204
 
3.0%
Other values (21) 27737
37.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47397
64.7%
Uppercase Letter 12952
 
17.7%
Space Separator 8634
 
11.8%
Other Punctuation 4316
 
5.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 6616
14.0%
n 6522
13.8%
e 6336
13.4%
a 4318
9.1%
o 2205
 
4.7%
g 2204
 
4.7%
u 2204
 
4.7%
m 2204
 
4.7%
r 2114
 
4.5%
w 2112
 
4.5%
Other values (8) 10562
22.3%
Uppercase Letter
ValueCountFrequency (%)
J 2206
17.0%
A 2205
17.0%
M 2204
17.0%
W 2110
16.3%
S 2110
16.3%
P 2110
16.3%
L 2
 
< 0.1%
E 2
 
< 0.1%
N 1
 
< 0.1%
F 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
8634
100.0%
Other Punctuation
ValueCountFrequency (%)
. 4316
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 60349
82.3%
Common 12950
 
17.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 6616
 
11.0%
n 6522
 
10.8%
e 6336
 
10.5%
a 4318
 
7.2%
J 2206
 
3.7%
A 2205
 
3.7%
o 2205
 
3.7%
g 2204
 
3.7%
u 2204
 
3.7%
M 2204
 
3.7%
Other values (19) 23329
38.7%
Common
ValueCountFrequency (%)
(space) 8634
66.7%
. 4316
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 73299
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 8634
 
11.8%
s 6616
 
9.0%
n 6522
 
8.9%
e 6336
 
8.6%
a 4318
 
5.9%
. 4316
 
5.9%
J 2206
 
3.0%
A 2205
 
3.0%
o 2205
 
3.0%
g 2204
 
3.0%
Other values (21) 27737
37.8%

georeferencedDate
Text

Missing 

Distinct             43
Distinct (%)         0.4%
Missing              174887
Missing (%)          93.8%
Memory size          1.4 MiB

Length

Max length           10
Median length        10
Mean length          8.869266449
Min length           4

Characters and Unicode

Total characters     103256
Distinct characters  11
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               11
Unique (%)           0.1%

Sample

1st row  2016-11-04
2nd row  2023-06-13
3rd row  2015
4th row  2016-11-04
5th row  2023-08-24

Value              Count  Frequency (%)
2015               2193   18.8%
2016-11-04         1996   17.1%
2023-08-24         1867   16.0%
2016-06-23         1595   13.7%
2023-06-13         1462   12.6%
2024-05-18         1141   9.8%
2024-01-17         616    5.3%
2023-08-13         395    3.4%
2016-10-31         121    1.0%
2016-10-28         51     0.4%
Other values (33)  205    1.8%

Most occurring characters

ValueCountFrequency (%)
0 21119
20.5%
2 20850
20.2%
- 18896
18.3%
1 14751
14.3%
3 7402
 
7.2%
6 6972
 
6.8%
4 5713
 
5.5%
8 3515
 
3.4%
5 3345
 
3.2%
7 673
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 84360
81.7%
Dash Punctuation 18896
 
18.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 21119
25.0%
2 20850
24.7%
1 14751
17.5%
3 7402
 
8.8%
6 6972
 
8.3%
4 5713
 
6.8%
8 3515
 
4.2%
5 3345
 
4.0%
7 673
 
0.8%
9 20
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 18896
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 103256
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 21119
20.5%
2 20850
20.2%
- 18896
18.3%
1 14751
14.3%
3 7402
 
7.2%
6 6972
 
6.8%
4 5713
 
5.5%
8 3515
 
3.4%
5 3345
 
3.2%
7 673
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 103256
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 21119
20.5%
2 20850
20.2%
- 18896
18.3%
1 14751
14.3%
3 7402
 
7.2%
6 6972
 
6.8%
4 5713
 
5.5%
8 3515
 
3.4%
5 3345
 
3.2%
7 673
 
0.7%

georeferenceProtocol
Text

Missing 

Distinct             3
Distinct (%)         < 0.1%
Missing              82331
Missing (%)          44.1%
Memory size          1.4 MiB

Length

Max length           17
Median length        17
Mean length          16.43626557
Min length           11

Characters and Unicode

Total characters     1712626
Distinct characters  18
Distinct categories  2
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               0
Unique (%)           0.0%

Sample

1st row  physical resource
2nd row  digital resource
3rd row  digital resource
4th row  physical resource
5th row  physical resource

Value        Count   Frequency (%)
resource     102675  49.6%
physical     53073   25.7%
digital      49602   24.0%
unspecified  1523    0.7%

Most occurring characters

ValueCountFrequency (%)
e 208396
12.2%
r 205350
12.0%
s 157271
9.2%
c 157271
9.2%
i 155323
9.1%
u 104198
 
6.1%
a 102675
 
6.0%
l 102675
 
6.0%
(space) 102675
 
6.0%
o 102675
 
6.0%
Other values (8) 314117
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1609951
94.0%
Space Separator 102675
 
6.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 208396
12.9%
r 205350
12.8%
s 157271
9.8%
c 157271
9.8%
i 155323
9.6%
u 104198
6.5%
a 102675
 
6.4%
l 102675
 
6.4%
o 102675
 
6.4%
p 54596
 
3.4%
Other values (7) 259521
16.1%
Space Separator
ValueCountFrequency (%)
(space) 102675
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1609951
94.0%
Common 102675
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 208396
12.9%
r 205350
12.8%
s 157271
9.8%
c 157271
9.8%
i 155323
9.6%
u 104198
6.5%
a 102675
 
6.4%
l 102675
 
6.4%
o 102675
 
6.4%
p 54596
 
3.4%
Other values (7) 259521
16.1%
Common
ValueCountFrequency (%)
(space) 102675
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1712626
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 208396
12.2%
r 205350
12.0%
s 157271
9.2%
c 157271
9.2%
i 155323
9.1%
u 104198
 
6.1%
a 102675
 
6.0%
l 102675
 
6.0%
(space) 102675
 
6.0%
o 102675
 
6.0%
Other values (8) 314117
18.3%

georeferenceSources
Text

Missing 

Distinct             12
Distinct (%)         < 0.1%
Missing              83888
Missing (%)          45.0%
Memory size          1.4 MiB

Length

Max length           31
Median length        15
Mean length          14.92256506
Min length           4

Characters and Unicode

Total characters     1531667
Distinct characters  40
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               1
Unique (%)           < 0.1%

Sample

1st row  topographic map
2nd row  GEOLocate
3rd row  GEOLocate
4th row  topographic map
5th row  topographic map

Value              Count  Frequency (%)
topographic        53027  25.2%
map                53027  25.2%
geolocate          31271  14.9%
usa                13341  6.3%
state              13210  6.3%
digital            13210  6.3%
data               13210  6.3%
resource           13210  6.3%
vertnet            1811   0.9%
unspecified        1524   0.7%
Other values (12)  3467   1.6%

Most occurring characters

ValueCountFrequency (%)
a 190363
12.4%
p 160611
10.5%
o 150929
 
9.9%
t 142161
 
9.3%
(space) 107667
 
7.0%
c 99042
 
6.5%
i 83723
 
5.5%
e 77911
 
5.1%
r 68240
 
4.5%
g 66434
 
4.3%
Other values (30) 384586
25.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1196078
78.1%
Uppercase Letter 227398
 
14.8%
Space Separator 107667
 
7.0%
Decimal Number 524
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 190363
15.9%
p 160611
13.4%
o 150929
12.6%
t 142161
11.9%
c 99042
8.3%
i 83723
7.0%
e 77911
6.5%
r 68240
 
5.7%
g 66434
 
5.6%
h 53210
 
4.4%
Other values (8) 103454
8.6%
Uppercase Letter
ValueCountFrequency (%)
G 32808
14.4%
E 31836
14.0%
O 31271
13.8%
L 31271
13.8%
S 27769
12.2%
D 26420
11.6%
R 13341
5.9%
U 13341
5.9%
A 13341
5.9%
V 2062
 
0.9%
Other values (7) 3938
 
1.7%
Decimal Number
ValueCountFrequency (%)
1 131
25.0%
4 131
25.0%
0 131
25.0%
2 131
25.0%
Space Separator
ValueCountFrequency (%)
(space) 107667
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1423476
92.9%
Common 108191
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 190363
13.4%
p 160611
11.3%
o 150929
10.6%
t 142161
10.0%
c 99042
 
7.0%
i 83723
 
5.9%
e 77911
 
5.5%
r 68240
 
4.8%
g 66434
 
4.7%
h 53210
 
3.7%
Other values (25) 330852
23.2%
Common
ValueCountFrequency (%)
(space) 107667
99.5%
1 131
 
0.1%
4 131
 
0.1%
0 131
 
0.1%
2 131
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1531667
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 190363
12.4%
p 160611
10.5%
o 150929
 
9.9%
t 142161
 
9.3%
(space) 107667
 
7.0%
c 99042
 
6.5%
i 83723
 
5.5%
e 77911
 
5.1%
r 68240
 
4.5%
g 66434
 
4.3%
Other values (30) 384586
25.1%

georeferenceRemarks
Text

Missing 

Distinct             6514
Distinct (%)         6.4%
Missing              85474
Missing (%)          45.8%
Memory size          1.4 MiB

Length

Max length           465
Median length        15
Mean length          63.16058582
Min length           2

Characters and Unicode

Total characters     6382693
Distinct characters  85
Distinct categories  11
Distinct scripts     2
Distinct blocks      2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               3825
Unique (%)           3.8%

Sample

1st row  from CT DEP Map
2nd row  ex Argus
3rd row  jlsanesdoc (2015-07-14 13:16:00); Geolocated to Shuswap Lake
4th row  from CT DEP Map
5th row  from CT DEP Map

Value                Count   Frequency (%)
the                  81840   8.6%
from                 60217   6.3%
ct                   56024   5.9%
map                  53164   5.6%
dep                  53055   5.6%
of                   33115   3.5%
centroid             28035   3.0%
polygon              28033   3.0%
uncertainty          18145   1.9%
database             17580   1.9%
Other values (8113)  520344  54.8%

Most occurring characters

ValueCountFrequency (%)
(space) 848525
 
13.3%
e 492230
 
7.7%
o 402931
 
6.3%
t 360014
 
5.6%
a 343662
 
5.4%
r 306928
 
4.8%
n 285528
 
4.5%
i 227311
 
3.6%
s 197953
 
3.1%
d 177049
 
2.8%
Other values (75) 2740562
42.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4013075
62.9%
Space Separator 848525
 
13.3%
Uppercase Letter 636507
 
10.0%
Decimal Number 495160
 
7.8%
Other Punctuation 236193
 
3.7%
Dash Punctuation 59535
 
0.9%
Open Punctuation 42944
 
0.7%
Close Punctuation 42942
 
0.7%
Connector Punctuation 7578
 
0.1%
Math Symbol 233
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 492230
12.3%
o 402931
 
10.0%
t 360014
 
9.0%
a 343662
 
8.6%
r 306928
 
7.6%
n 285528
 
7.1%
i 227311
 
5.7%
s 197953
 
4.9%
d 177049
 
4.4%
l 138359
 
3.4%
Other values (17) 1081110
26.9%
Uppercase Letter
ValueCountFrequency (%)
C 87444
13.7%
T 81216
12.8%
D 77234
12.1%
M 76938
12.1%
P 67525
10.6%
E 62274
9.8%
A 34574
 
5.4%
G 34244
 
5.4%
N 23701
 
3.7%
S 14621
 
2.3%
Other values (16) 76736
12.1%
Other Punctuation
ValueCountFrequency (%)
. 81586
34.5%
: 52775
22.3%
, 50428
21.4%
/ 28633
 
12.1%
; 21546
 
9.1%
& 772
 
0.3%
' 213
 
0.1%
" 120
 
0.1%
? 88
 
< 0.1%
% 27
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 116785
23.6%
2 103300
20.9%
1 87747
17.7%
5 44518
 
9.0%
4 35937
 
7.3%
6 29500
 
6.0%
3 24561
 
5.0%
8 24163
 
4.9%
7 14864
 
3.0%
9 13785
 
2.8%
Math Symbol
ValueCountFrequency (%)
= 177
76.0%
+ 55
 
23.6%
~ 1
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 42943
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 42941
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 848525
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 59535
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7578
100.0%
Other Symbol
ValueCountFrequency (%)
¦ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4649582
72.8%
Common 1733111
 
27.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 492230
 
10.6%
o 402931
 
8.7%
t 360014
 
7.7%
a 343662
 
7.4%
r 306928
 
6.6%
n 285528
 
6.1%
i 227311
 
4.9%
s 197953
 
4.3%
d 177049
 
3.8%
l 138359
 
3.0%
Other values (43) 1717617
36.9%
Common
ValueCountFrequency (%)
(space) 848525
49.0%
0 116785
 
6.7%
2 103300
 
6.0%
1 87747
 
5.1%
. 81586
 
4.7%
- 59535
 
3.4%
: 52775
 
3.0%
, 50428
 
2.9%
5 44518
 
2.6%
( 42943
 
2.5%
Other values (22) 244969
 
14.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6382686
> 99.9%
None 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 848525
 
13.3%
e 492230
 
7.7%
o 402931
 
6.3%
t 360014
 
5.6%
a 343662
 
5.4%
r 306928
 
4.8%
n 285528
 
4.5%
i 227311
 
3.6%
s 197953
 
3.1%
d 177049
 
2.8%
Other values (73) 2740555
42.9%
None
ValueCountFrequency (%)
ÿ 6
85.7%
¦ 1
 
14.3%

typeStatus
Text

Missing 

Distinct             16
Distinct (%)         0.4%
Missing              182607
Missing (%)          97.9%
Memory size          1.4 MiB

Length

Max length           13
Median length        7
Mean length          7.242988271
Min length           4

Characters and Unicode

Total characters     28407
Distinct characters  14
Distinct categories  2
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               2
Unique (%)           0.1%

Sample

1st row  isotype
2nd row  isosyntype
3rd row  isotype
4th row  isotype
5th row  isotype

Value             Count  Frequency (%)
isotype           2414   61.6%
syntype           851    21.7%
isolectotype      201    5.1%
type              197    5.0%
isosyntype        103    2.6%
holotype          90     2.3%
paratype          19     0.5%
lectotype         17     0.4%
cotype            16     0.4%
isoneotype        6      0.2%
Other values (3)  8      0.2%

Most occurring characters

ValueCountFrequency (%)
y 4876
17.2%
e 4147
14.6%
t 4146
14.6%
p 3949
13.9%
s 3680
13.0%
o 3158
11.1%
i 2727
9.6%
n 960
 
3.4%
l 308
 
1.1%
c 234
 
0.8%
Other values (4) 222
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28335
99.7%
Other Punctuation 72
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y 4876
17.2%
e 4147
14.6%
t 4146
14.6%
p 3949
13.9%
s 3680
13.0%
o 3158
11.1%
i 2727
9.6%
n 960
 
3.4%
l 308
 
1.1%
c 234
 
0.8%
Other values (3) 150
 
0.5%
Other Punctuation
ValueCountFrequency (%)
? 72
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28335
99.7%
Common 72
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
y 4876
17.2%
e 4147
14.6%
t 4146
14.6%
p 3949
13.9%
s 3680
13.0%
o 3158
11.1%
i 2727
9.6%
n 960
 
3.4%
l 308
 
1.1%
c 234
 
0.8%
Other values (3) 150
 
0.5%
Common
ValueCountFrequency (%)
? 72
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28407
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
y 4876
17.2%
e 4147
14.6%
t 4146
14.6%
p 3949
13.9%
s 3680
13.0%
o 3158
11.1%
i 2727
9.6%
n 960
 
3.4%
l 308
 
1.1%
c 234
 
0.8%
Other values (4) 222
 
0.8%

identifiedBy
Text

Missing 

Distinct             193
Distinct (%)         3.2%
Missing              180415
Missing (%)          96.7%
Memory size          1.4 MiB

Length

Max length           22
Median length        20
Mean length          16.31534184
Min length           5

Characters and Unicode

Total characters     99752
Distinct characters  54
Distinct categories  5
Distinct scripts     2
Distinct blocks      2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               54
Unique (%)           0.9%

Sample

1st row  Martin C. Van Boskirk
2nd row  Alexander W. Evans
3rd row  Mason E. Hale
4th row  Alexander W. Evans
5th row  M. H. Lewis

Value               Count  Frequency (%)
w                   1055   5.9%
alexander           744    4.2%
evans               744    4.2%
george              644    3.6%
f                   634    3.6%
j                   597    3.4%
c                   484    2.7%
k                   480    2.7%
h                   421    2.4%
carl                419    2.4%
Other values (324)  11577  65.0%

Most occurring characters

ValueCountFrequency (%)
(space) 11685
 
11.7%
e 8430
 
8.5%
r 7926
 
7.9%
a 6479
 
6.5%
n 5716
 
5.7%
l 5553
 
5.6%
. 5495
 
5.5%
o 5002
 
5.0%
i 4237
 
4.2%
s 3571
 
3.6%
Other values (44) 35658
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 64705
64.9%
Uppercase Letter 17846
 
17.9%
Space Separator 11685
 
11.7%
Other Punctuation 5495
 
5.5%
Dash Punctuation 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 8430
13.0%
r 7926
12.2%
a 6479
10.0%
n 5716
8.8%
l 5553
8.6%
o 5002
7.7%
i 4237
 
6.5%
s 3571
 
5.5%
t 3188
 
4.9%
d 2155
 
3.3%
Other values (18) 12448
19.2%
Uppercase Letter
ValueCountFrequency (%)
W 1964
11.0%
A 1863
10.4%
M 1766
9.9%
C 1598
 
9.0%
E 1480
 
8.3%
G 1140
 
6.4%
B 1130
 
6.3%
J 997
 
5.6%
H 913
 
5.1%
F 862
 
4.8%
Other values (13) 4133
23.2%
Space Separator
ValueCountFrequency (%)
(space) 11685
100.0%
Other Punctuation
ValueCountFrequency (%)
. 5495
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 82551
82.8%
Common 17201
 
17.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 8430
 
10.2%
r 7926
 
9.6%
a 6479
 
7.8%
n 5716
 
6.9%
l 5553
 
6.7%
o 5002
 
6.1%
i 4237
 
5.1%
s 3571
 
4.3%
t 3188
 
3.9%
d 2155
 
2.6%
Other values (41) 30294
36.7%
Common
ValueCountFrequency (%)
(space) 11685
67.9%
. 5495
31.9%
- 21
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99635
99.9%
None 117
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 11685
 
11.7%
e 8430
 
8.5%
r 7926
 
8.0%
a 6479
 
6.5%
n 5716
 
5.7%
l 5553
 
5.6%
. 5495
 
5.5%
o 5002
 
5.0%
i 4237
 
4.3%
s 3571
 
3.6%
Other values (42) 35541
35.7%
None
ValueCountFrequency (%)
á 105
89.7%
é 12
 
10.3%

dateIdentified
Text

Missing 

Distinct             85
Distinct (%)         4.4%
Missing              184582
Missing (%)          99.0%
Memory size          1.4 MiB

Length

Max length           4
Median length        4
Mean length          4
Min length           4

Characters and Unicode

Total characters     7788
Distinct characters  10
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               20
Unique (%)           1.0%

Sample

1st row  1997
2nd row  1946
3rd row  1946
4th row  1946
5th row  1995

Value              Count  Frequency (%)
1995               414    21.3%
1997               349    17.9%
1984               135    6.9%
1954               102    5.2%
1956               80     4.1%
1946               63     3.2%
1962               61     3.1%
1953               60     3.1%
1957               54     2.8%
1979               52     2.7%
Other values (75)  577    29.6%

Most occurring characters

ValueCountFrequency (%)
9 2681
34.4%
1 1886
24.2%
5 908
 
11.7%
7 588
 
7.6%
8 337
 
4.3%
2 334
 
4.3%
4 333
 
4.3%
0 312
 
4.0%
6 285
 
3.7%
3 124
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7788
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 2681
34.4%
1 1886
24.2%
5 908
 
11.7%
7 588
 
7.6%
8 337
 
4.3%
2 334
 
4.3%
4 333
 
4.3%
0 312
 
4.0%
6 285
 
3.7%
3 124
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common 7788
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 2681
34.4%
1 1886
24.2%
5 908
 
11.7%
7 588
 
7.6%
8 337
 
4.3%
2 334
 
4.3%
4 333
 
4.3%
0 312
 
4.0%
6 285
 
3.7%
3 124
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7788
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 2681
34.4%
1 1886
24.2%
5 908
 
11.7%
7 588
 
7.6%
8 337
 
4.3%
2 334
 
4.3%
4 333
 
4.3%
0 312
 
4.0%
6 285
 
3.7%
3 124
 
1.6%

identificationRemarks
Text

Missing 

Distinct             2949
Distinct (%)         79.8%
Missing              182833
Missing (%)          98.0%
Memory size          1.4 MiB

Length

Max length           283
Median length        167
Mean length          48.17316017
Min length           9

Characters and Unicode

Total characters     178048
Distinct characters  88
Distinct categories  9
Distinct scripts     2
Distinct blocks      3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               2406
Unique (%)           65.1%

Sample

1st row  Proc. Amer. Acad. Arts. 22: 420. 1887.
2nd row  Mem. Amer. Acad. Arts. n.s. 520. 1862.
3rd row  Pl. Wright. (Grisebach) 1: 173. 1860.
4th row  Proc. Amer. Acad. 22: 428. 1887.
5th row  Proceedings of the American Academy of Arts and Sciences. 7: 381. 1868.

Value                Count  Frequency (%)
of                   1913   6.4%
the                  1010   3.4%
arts                 820    2.8%
acad                 670    2.3%
amer                 663    2.2%
american             639    2.1%
and                  627    2.1%
academy              614    2.1%
sciences             570    1.9%
proc                 563    1.9%
Other values (2176)  21659  72.8%

Most occurring characters

ValueCountFrequency (%)
(space) 26052
 
14.6%
. 14090
 
7.9%
e 10440
 
5.9%
a 8607
 
4.8%
1 7541
 
4.2%
o 7231
 
4.1%
r 7176
 
4.0%
n 6647
 
3.7%
t 6588
 
3.7%
i 6274
 
3.5%
Other values (78) 77402
43.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 87279
49.0%
Decimal Number 30944
 
17.4%
Space Separator 26052
 
14.6%
Other Punctuation 17631
 
9.9%
Uppercase Letter 14635
 
8.2%
Dash Punctuation 555
 
0.3%
Close Punctuation 470
 
0.3%
Open Punctuation 470
 
0.3%
Math Symbol 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10440
12.0%
a 8607
9.9%
o 7231
 
8.3%
r 7176
 
8.2%
n 6647
 
7.6%
t 6588
 
7.5%
i 6274
 
7.2%
c 5785
 
6.6%
s 4551
 
5.2%
l 4141
 
4.7%
Other values (23) 19839
22.7%
Uppercase Letter
ValueCountFrequency (%)
A 3827
26.1%
P 2135
14.6%
S 1494
 
10.2%
C 1489
 
10.2%
B 1022
 
7.0%
G 628
 
4.3%
N 580
 
4.0%
F 472
 
3.2%
M 426
 
2.9%
R 325
 
2.2%
Other values (16) 2237
15.3%
Other Punctuation
ValueCountFrequency (%)
. 14090
79.9%
: 2970
 
16.8%
, 359
 
2.0%
; 90
 
0.5%
' 75
 
0.4%
& 29
 
0.2%
" 13
 
0.1%
# 2
 
< 0.1%
/ 2
 
< 0.1%
? 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 7541
24.4%
8 5251
17.0%
6 3239
10.5%
2 2901
 
9.4%
7 2254
 
7.3%
9 2082
 
6.7%
3 2071
 
6.7%
4 1961
 
6.3%
5 1948
 
6.3%
0 1696
 
5.5%
Dash Punctuation
ValueCountFrequency (%)
- 530
95.5%
(non-ASCII dash) 25
 
4.5%
Close Punctuation
ValueCountFrequency (%)
) 311
66.2%
] 159
33.8%
Open Punctuation
ValueCountFrequency (%)
( 310
66.0%
[ 160
34.0%
Math Symbol
ValueCountFrequency (%)
= 10
83.3%
+ 2
 
16.7%
Space Separator
ValueCountFrequency (%)
(space) 26052
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 101914
57.2%
Common 76134
42.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10440
 
10.2%
a 8607
 
8.4%
o 7231
 
7.1%
r 7176
 
7.0%
n 6647
 
6.5%
t 6588
 
6.5%
i 6274
 
6.2%
c 5785
 
5.7%
s 4551
 
4.5%
l 4141
 
4.1%
Other values (49) 34474
33.8%
Common
ValueCountFrequency (%)
(space) 26052
34.2%
. 14090
18.5%
1 7541
 
9.9%
8 5251
 
6.9%
6 3239
 
4.3%
: 2970
 
3.9%
2 2901
 
3.8%
7 2254
 
3.0%
9 2082
 
2.7%
3 2071
 
2.7%
Other values (19) 7683
 
10.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 177975
> 99.9%
None 48
 
< 0.1%
Punctuation 25
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 26052
 
14.6%
. 14090
 
7.9%
e 10440
 
5.9%
a 8607
 
4.8%
1 7541
 
4.2%
o 7231
 
4.1%
r 7176
 
4.0%
n 6647
 
3.7%
t 6588
 
3.7%
i 6274
 
3.5%
Other values (70) 77329
43.4%
None
ValueCountFrequency (%)
ü 25
52.1%
é 9
 
18.8%
ö 8
 
16.7%
è 2
 
4.2%
ä 2
 
4.2%
ë 1
 
2.1%
ñ 1
 
2.1%
Punctuation
ValueCountFrequency (%)
(non-ASCII dash) 25
100.0%
Distinct             16379
Distinct (%)         8.8%
Missing              0
Missing (%)          0.0%
Memory size          1.4 MiB

Length

Max length           50
Median length        43
Mean length          15.95681637
Min length           3

Characters and Unicode

Total characters     2976409
Distinct characters  58
Distinct categories  7
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               7549
Unique (%)           4.0%

Sample

1st row  Luzula bulbosa
2nd row  Gentiana clausa
3rd row  Carex muhlenbergii
4th row  Lophocolea minor
5th row  Plantae

Value                 Count   Frequency (%)
plantae               28374   8.6%
carex                 8803    2.7%
var                   3699    1.1%
dryopteris            2392    0.7%
sphagnum              2360    0.7%
juncus                1814    0.5%
frullania             1708    0.5%
asplenium             1557    0.5%
scapania              1517    0.5%
canadensis            1515    0.5%
Other values (11105)  276305  83.7%

Most occurring characters

ValueCountFrequency (%)
a 389969
13.1%
i 262458
 
8.8%
e 205819
 
6.9%
l 196972
 
6.6%
r 175234
 
5.9%
n 170772
 
5.7%
u 162576
 
5.5%
o 156926
 
5.3%
s 154857
 
5.2%
t 146008
 
4.9%
Other values (48) 954818
32.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2641527
88.7%
Uppercase Letter 186514
 
6.3%
Space Separator 143515
 
4.8%
Other Punctuation 4146
 
0.1%
Dash Punctuation 705
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 389969
14.8%
i 262458
9.9%
e 205819
 
7.8%
l 196972
 
7.5%
r 175234
 
6.6%
n 170772
 
6.5%
u 162576
 
6.2%
o 156926
 
5.9%
s 154857
 
5.9%
t 146008
 
5.5%
Other values (16) 619936
23.5%
Uppercase Letter
ValueCountFrequency (%)
P 49351
26.5%
C 26040
14.0%
S 16978
 
9.1%
A 13951
 
7.5%
L 10862
 
5.8%
D 7781
 
4.2%
R 6989
 
3.7%
E 6742
 
3.6%
B 6396
 
3.4%
M 6026
 
3.2%
Other values (16) 35398
19.0%
Other Punctuation
ValueCountFrequency (%)
. 4144
> 99.9%
? 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 143515
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 705
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2828041
95.0%
Common 148368
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 389969
13.8%
i 262458
 
9.3%
e 205819
 
7.3%
l 196972
 
7.0%
r 175234
 
6.2%
n 170772
 
6.0%
u 162576
 
5.7%
o 156926
 
5.5%
s 154857
 
5.5%
t 146008
 
5.2%
Other values (42) 806450
28.5%
Common
ValueCountFrequency (%)
(space) 143515
96.7%
. 4144
 
2.8%
- 705
 
0.5%
? 2
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2976409
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 389969
13.1%
i 262458
 
8.8%
e 205819
 
6.9%
l 196972
 
6.6%
r 175234
 
5.9%
n 170772
 
5.7%
u 162576
 
5.5%
o 156926
 
5.3%
s 154857
 
5.2%
t 146008
 
4.9%
Other values (48) 954818
32.1%
Distinct             792
Distinct (%)         0.4%
Missing              18
Missing (%)          < 0.1%
Memory size          1.4 MiB

Length

Max length           119
Median length        91
Mean length          47.98150243
Min length           5

Characters and Unicode

Total characters     8949078
Distinct characters  53
Distinct categories  4
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               93
Unique (%)           < 0.1%

Sample

1st row  Plantae; Tracheophyta; Poales; Juncaceae
2nd row  Plantae; Tracheophyta; Asteridae; Gentianales; Gentianaceae
3rd row  Plantae; Tracheophyta; Poales; Cyperaceae
4th row  Plantae; Bryophyta; Hepaticopsida; Jungermanniales; Lophocoleaceae
5th row  Plantae

Value                Count   Frequency (%)
plantae              177514  22.7%
tracheophyta         104057  13.3%
bryophyta            37100   4.7%
poales               23133   3.0%
hepaticopsida        21780   2.8%
asteridae            20956   2.7%
rosidae              18590   2.4%
jungermanniales      16103   2.1%
polypodiales         14202   1.8%
cyperaceae           13776   1.8%
Other values (1036)  334890  42.8%

Most occurring characters

ValueCountFrequency (%)
a 1419556
15.9%
e 1079293
 
12.1%
; 595590
 
6.7%
(space) 595590
 
6.7%
t 468738
 
5.2%
l 453031
 
5.1%
o 447517
 
5.0%
c 388641
 
4.3%
r 357029
 
4.0%
h 342137
 
3.8%
Other values (43) 2801956
31.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6975797
77.9%
Uppercase Letter 782101
 
8.7%
Other Punctuation 595590
 
6.7%
Space Separator 595590
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1419556
20.3%
e 1079293
15.5%
t 468738
 
6.7%
l 453031
 
6.5%
o 447517
 
6.4%
c 388641
 
5.6%
r 357029
 
5.1%
h 342137
 
4.9%
i 338965
 
4.9%
n 337540
 
4.8%
Other values (16) 1343350
19.3%
Uppercase Letter
ValueCountFrequency (%)
P 253748
32.4%
T 107375
13.7%
A 69797
 
8.9%
B 65121
 
8.3%
C 46187
 
5.9%
R 42144
 
5.4%
F 32277
 
4.1%
H 30368
 
3.9%
L 25762
 
3.3%
J 22652
 
2.9%
Other values (15) 86670
 
11.1%
Other Punctuation
ValueCountFrequency (%)
; 595590
100.0%
Space Separator
ValueCountFrequency (%)
(space) 595590
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7757898
86.7%
Common 1191180
 
13.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1419556
18.3%
e 1079293
13.9%
t 468738
 
6.0%
l 453031
 
5.8%
o 447517
 
5.8%
c 388641
 
5.0%
r 357029
 
4.6%
h 342137
 
4.4%
i 338965
 
4.4%
n 337540
 
4.4%
Other values (41) 2125451
27.4%
Common
ValueCountFrequency (%)
; 595590
50.0%
(space) 595590
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8949078
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1419556
15.9%
e 1079293
 
12.1%
; 595590
 
6.7%
(space) 595590
 
6.7%
t 468738
 
5.2%
l 453031
 
5.1%
o 447517
 
5.0%
c 388641
 
4.3%
r 357029
 
4.0%
h 342137
 
3.8%
Other values (43) 2801956
31.3%
Distinct             6
Distinct (%)         < 0.1%
Missing              18
Missing (%)          < 0.1%
Memory size          1.4 MiB

Length

Max length           9
Median length        7
Mean length          6.981170011
Min length           5

Characters and Unicode

Total characters     1302065
Distinct characters  20
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               0
Unique (%)           0.0%

Sample

1st row  Plantae
2nd row  Plantae
3rd row  Plantae
4th row  Plantae
5th row  Plantae

Value      Count   Frequency (%)
plantae    177514  95.2%
fungi      5158    2.8%
chromista  2965    1.6%
bacteria   869     0.5%
protozoa   3       < 0.1%
animalia   2       < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 359738
27.6%
n 182674
14.0%
t 181351
13.9%
e 178383
13.7%
P 177517
13.6%
l 177516
13.6%
i 8996
 
0.7%
F 5158
 
0.4%
u 5158
 
0.4%
g 5158
 
0.4%
Other values (10) 20416
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1115554
85.7%
Uppercase Letter 186511
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 359738
32.2%
n 182674
16.4%
t 181351
16.3%
e 178383
16.0%
l 177516
15.9%
i 8996
 
0.8%
u 5158
 
0.5%
g 5158
 
0.5%
r 3837
 
0.3%
o 2974
 
0.3%
Other values (5) 9769
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
P 177517
95.2%
F 5158
 
2.8%
C 2965
 
1.6%
B 869
 
0.5%
A 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1302065
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 359738
27.6%
n 182674
14.0%
t 181351
13.9%
e 178383
13.7%
P 177517
13.6%
l 177516
13.6%
i 8996
 
0.7%
F 5158
 
0.4%
u 5158
 
0.4%
g 5158
 
0.4%
Other values (10) 20416
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1302065
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 359738
27.6%
n 182674
14.0%
t 181351
13.9%
e 178383
13.7%
P 177517
13.6%
l 177516
13.6%
i 8996
 
0.7%
F 5158
 
0.4%
u 5158
 
0.4%
g 5158
 
0.4%
Other values (10) 20416
 
1.6%

phylum
Text

Missing 

Distinct             16
Distinct (%)         < 0.1%
Missing              28457
Missing (%)          15.3%
Memory size          1.4 MiB

Length

Max length           15
Median length        12
Mean length          11.11053191
Min length           5

Characters and Unicode

Total characters     1756264
Distinct characters  29
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique               4
Unique (%)           < 0.1%

Sample

1st row  Tracheophyta
2nd row  Tracheophyta
3rd row  Tracheophyta
4th row  Bryophyta
5th row  Tracheophyta

Value             Count   Frequency (%)
tracheophyta      104057  65.8%
bryophyta         37100   23.5%
rhodophyta        5572    3.5%
ascomycota        5095    3.2%
ochrophyta        2939    1.9%
chlorophyta       1791    1.1%
cyanobacteria     867     0.5%
charophyta        612     0.4%
bacillariophyta   26      < 0.1%
basidiomycota     4       < 0.1%
Other values (6)  9       < 0.1%

Most occurring characters

ValueCountFrequency (%)
h 267069
15.2%
a 264532
15.1%
y 195164
11.1%
o 170535
9.7%
t 158067
9.0%
p 152098
8.7%
r 147394
8.4%
c 118089
6.7%
e 104930
 
6.0%
T 104057
 
5.9%
Other values (19) 74329
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1598192
91.0%
Uppercase Letter 158072
 
9.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
h 267069
16.7%
a 264532
16.6%
y 195164
12.2%
o 170535
10.7%
t 158067
9.9%
p 152098
9.5%
r 147394
9.2%
c 118089
7.4%
e 104930
 
6.6%
d 5576
 
0.3%
Other values (9) 14738
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
T 104057
65.8%
B 37130
 
23.5%
R 5572
 
3.5%
A 5097
 
3.2%
C 3270
 
2.1%
O 2939
 
1.9%
E 3
 
< 0.1%
M 2
 
< 0.1%
G 1
 
< 0.1%
F 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1756264
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
h 267069
15.2%
a 264532
15.1%
y 195164
11.1%
o 170535
9.7%
t 158067
9.0%
p 152098
8.7%
r 147394
8.4%
c 118089
6.7%
e 104930
 
6.0%
T 104057
 
5.9%
Other values (19) 74329
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1756264
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
h 267069
15.2%
a 264532
15.1%
y 195164
11.1%
o 170535
9.7%
t 158067
9.0%
p 152098
8.7%
r 147394
8.4%
c 118089
6.7%
e 104930
 
6.0%
T 104057
 
5.9%
Other values (19) 74329
 
4.2%

class
Text

Missing 

Distinct38
Distinct (%)0.1%
Missing132536
Missing (%)71.1%
Memory size1.4 MiB

Length

Max length19
Median length18
Mean length12.33595096
Min length9

Characters and Unicode

Total characters666055
Distinct characters37
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)< 0.1%

Sample

1st rowHepaticopsida
2nd rowUlvophyceae
3rd rowFlorideophyceae
4th rowUlvophyceae
5th rowHepaticopsida
ValueCountFrequency (%)
hepaticopsida 21780
40.3%
bryopsida 12436
23.0%
florideophyceae 5413
 
10.0%
lecanoromycetes 4682
 
8.7%
phaeophyceae 2902
 
5.4%
sphagnopsida 2360
 
4.4%
ulvophyceae 1414
 
2.6%
cyanophyceae 867
 
1.6%
anthocerotopsida 428
 
0.8%
charophyceae 324
 
0.6%
Other values (28) 1387
 
2.6%

Most occurring characters

ValueCountFrequency (%)
a 82413
12.4%
p 73091
11.0%
e 69552
10.4%
o 65769
9.9%
i 64991
9.8%
c 43786
 
6.6%
d 42708
 
6.4%
s 42236
 
6.3%
y 30456
 
4.6%
t 28458
 
4.3%
Other values (27) 122595
18.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 612062
91.9%
Uppercase Letter 53993
 
8.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 82413
13.5%
p 73091
11.9%
e 69552
11.4%
o 65769
10.7%
i 64991
10.6%
c 43786
7.2%
d 42708
7.0%
s 42236
6.9%
y 30456
 
5.0%
t 28458
 
4.6%
Other values (10) 68602
11.2%
Uppercase Letter
ValueCountFrequency (%)
H 21780
40.3%
B 12597
23.3%
F 5413
 
10.0%
L 4698
 
8.7%
P 2907
 
5.4%
S 2362
 
4.4%
C 1514
 
2.8%
U 1414
 
2.6%
A 617
 
1.1%
Z 279
 
0.5%
Other values (7) 412
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 666055
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 82413
12.4%
p 73091
11.0%
e 69552
10.4%
o 65769
9.9%
i 64991
9.8%
c 43786
 
6.6%
d 42708
 
6.4%
s 42236
 
6.3%
y 30456
 
4.6%
t 28458
 
4.3%
Other values (27) 122595
18.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 666055
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 82413
12.4%
p 73091
11.0%
e 69552
10.4%
o 65769
9.9%
i 64991
9.8%
c 43786
 
6.6%
d 42708
 
6.4%
s 42236
 
6.3%
y 30456
 
4.6%
t 28458
 
4.3%
Other values (27) 122595
18.4%

order
Text

Missing 

Distinct190
Distinct (%)0.1%
Missing34477
Missing (%)18.5%
Memory size1.4 MiB

Length

Max length17
Median length14
Mean length10.17703812
Min length6

Characters and Unicode

Total characters1547439
Distinct characters47
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11
Unique (%)< 0.1%

Sample

1st rowPoales
2nd rowGentianales
3rd rowPoales
4th rowJungermanniales
5th rowLamiales
ValueCountFrequency (%)
poales 23133
 
15.2%
jungermanniales 16103
 
10.6%
polypodiales 14202
 
9.3%
asterales 7347
 
4.8%
asparagales 5678
 
3.7%
fabales 5478
 
3.6%
lamiales 4706
 
3.1%
hypnales 4509
 
3.0%
rosales 3883
 
2.6%
caryophyllales 3363
 
2.2%
Other values (180) 63650
41.9%

Most occurring characters

ValueCountFrequency (%)
a 235502
15.2%
e 200260
12.9%
l 198936
12.9%
s 184853
11.9%
i 83937
 
5.4%
o 79637
 
5.1%
n 78001
 
5.0%
r 64696
 
4.2%
P 42317
 
2.7%
p 40657
 
2.6%
Other values (37) 338643
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1395387
90.2%
Uppercase Letter 152052
 
9.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 235502
16.9%
e 200260
14.4%
l 198936
14.3%
s 184853
13.2%
i 83937
 
6.0%
o 79637
 
5.7%
n 78001
 
5.6%
r 64696
 
4.6%
p 40657
 
2.9%
g 35268
 
2.5%
Other values (15) 193640
13.9%
Uppercase Letter
ValueCountFrequency (%)
P 42317
27.8%
A 16851
 
11.1%
J 16103
 
10.6%
L 10444
 
6.9%
F 9513
 
6.3%
C 9232
 
6.1%
M 8980
 
5.9%
R 6168
 
4.1%
S 6063
 
4.0%
H 5491
 
3.6%
Other values (12) 20890
13.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1547439
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 235502
15.2%
e 200260
12.9%
l 198936
12.9%
s 184853
11.9%
i 83937
 
5.4%
o 79637
 
5.1%
n 78001
 
5.0%
r 64696
 
4.2%
P 42317
 
2.7%
p 40657
 
2.6%
Other values (37) 338643
21.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1547439
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 235502
15.2%
e 200260
12.9%
l 198936
12.9%
s 184853
11.9%
i 83937
 
5.4%
o 79637
 
5.1%
n 78001
 
5.0%
r 64696
 
4.2%
P 42317
 
2.7%
p 40657
 
2.6%
Other values (37) 338643
21.9%

family
Text

Missing 

Distinct709
Distinct (%)0.4%
Missing28617
Missing (%)15.3%
Memory size1.4 MiB

Length

Max length20
Median length17
Mean length11.41779599
Min length7

Characters and Unicode

Total characters1803007
Distinct characters51
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique76
Unique (%)< 0.1%

Sample

1st rowJuncaceae
2nd rowGentianaceae
3rd rowCyperaceae
4th rowLophocoleaceae
5th rowPhrymaceae
ValueCountFrequency (%)
cyperaceae 13776
 
8.7%
asteraceae 7281
 
4.6%
poaceae 6277
 
4.0%
fabaceae 4752
 
3.0%
dryopteridaceae 3543
 
2.2%
jungermanniaceae 3535
 
2.2%
orchidaceae 3466
 
2.2%
rosaceae 3290
 
2.1%
pteridaceae 3106
 
2.0%
sphagnaceae 2360
 
1.5%
Other values (699) 106526
67.5%

Most occurring characters

ValueCountFrequency (%)
e 393344
21.8%
a 387162
21.5%
c 195648
10.9%
i 91861
 
5.1%
r 80955
 
4.5%
o 69117
 
3.8%
n 58000
 
3.2%
l 57730
 
3.2%
p 43539
 
2.4%
t 43209
 
2.4%
Other values (41) 382442
21.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1645095
91.2%
Uppercase Letter 157912
 
8.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 393344
23.9%
a 387162
23.5%
c 195648
11.9%
i 91861
 
5.6%
r 80955
 
4.9%
o 69117
 
4.2%
n 58000
 
3.5%
l 57730
 
3.5%
p 43539
 
2.6%
t 43209
 
2.6%
Other values (16) 224530
13.6%
Uppercase Letter
ValueCountFrequency (%)
C 25112
15.9%
P 24851
15.7%
A 19547
12.4%
S 10478
 
6.6%
L 10361
 
6.6%
R 9931
 
6.3%
F 8348
 
5.3%
O 7251
 
4.6%
D 6833
 
4.3%
B 6285
 
4.0%
Other values (15) 28915
18.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1803007
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 393344
21.8%
a 387162
21.5%
c 195648
10.9%
i 91861
 
5.1%
r 80955
 
4.5%
o 69117
 
3.8%
n 58000
 
3.2%
l 57730
 
3.2%
p 43539
 
2.4%
t 43209
 
2.4%
Other values (41) 382442
21.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1803007
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 393344
21.8%
a 387162
21.5%
c 195648
10.9%
i 91861
 
5.1%
r 80955
 
4.5%
o 69117
 
3.8%
n 58000
 
3.2%
l 57730
 
3.2%
p 43539
 
2.4%
t 43209
 
2.4%
Other values (41) 382442
21.2%

genus
Text

Missing 

Distinct3722
Distinct (%)2.4%
Missing28567
Missing (%)15.3%
Memory size1.4 MiB

Length

Max length18
Median length15
Mean length8.583070612
Min length3

Characters and Unicode

Total characters1355799
Distinct characters52
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1001
Unique (%)0.6%

Sample

1st rowLuzula
2nd rowGentiana
3rd rowCarex
4th rowLophocolea
5th rowMimulus
ValueCountFrequency (%)
carex 8803
 
5.6%
dryopteris 2365
 
1.5%
sphagnum 2360
 
1.5%
juncus 1814
 
1.1%
frullania 1708
 
1.1%
asplenium 1557
 
1.0%
scapania 1517
 
1.0%
sargassum 1504
 
1.0%
polypodium 1453
 
0.9%
panicum 1316
 
0.8%
Other values (3712) 133565
84.6%

Most occurring characters

ValueCountFrequency (%)
a 157452
 
11.6%
i 121621
 
9.0%
e 90445
 
6.7%
o 90346
 
6.7%
r 87987
 
6.5%
u 80938
 
6.0%
l 79564
 
5.9%
s 65608
 
4.8%
n 63564
 
4.7%
m 62512
 
4.6%
Other values (42) 455762
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1197837
88.3%
Uppercase Letter 157962
 
11.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 157452
13.1%
i 121621
10.2%
e 90445
 
7.6%
o 90346
 
7.5%
r 87987
 
7.3%
u 80938
 
6.8%
l 79564
 
6.6%
s 65608
 
5.5%
n 63564
 
5.3%
m 62512
 
5.2%
Other values (16) 297800
24.9%
Uppercase Letter
ValueCountFrequency (%)
C 26037
16.5%
P 20967
13.3%
S 16977
10.7%
A 13942
 
8.8%
L 10858
 
6.9%
D 7781
 
4.9%
R 6988
 
4.4%
E 6742
 
4.3%
B 6370
 
4.0%
M 5935
 
3.8%
Other values (16) 35365
22.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1355799
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 157452
 
11.6%
i 121621
 
9.0%
e 90445
 
6.7%
o 90346
 
6.7%
r 87987
 
6.5%
u 80938
 
6.0%
l 79564
 
5.9%
s 65608
 
4.8%
n 63564
 
4.7%
m 62512
 
4.6%
Other values (42) 455762
33.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1355799
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 157452
 
11.6%
i 121621
 
9.0%
e 90445
 
6.7%
o 90346
 
6.7%
r 87987
 
6.5%
u 80938
 
6.0%
l 79564
 
5.9%
s 65608
 
4.8%
n 63564
 
4.7%
m 62512
 
4.6%
Other values (42) 455762
33.6%

subgenus
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing186528
Missing (%)> 99.9%
Memory size1.4 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters10
Distinct characters8
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowCyclosorus
ValueCountFrequency (%)
cyclosorus 1
100.0%

Most occurring characters

ValueCountFrequency (%)
o 2
20.0%
s 2
20.0%
C 1
10.0%
y 1
10.0%
c 1
10.0%
l 1
10.0%
r 1
10.0%
u 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
90.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2
22.2%
s 2
22.2%
y 1
11.1%
c 1
11.1%
l 1
11.1%
r 1
11.1%
u 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2
20.0%
s 2
20.0%
C 1
10.0%
y 1
10.0%
c 1
10.0%
l 1
10.0%
r 1
10.0%
u 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2
20.0%
s 2
20.0%
C 1
10.0%
y 1
10.0%
c 1
10.0%
l 1
10.0%
r 1
10.0%
u 1
10.0%

specificEpithet
Text

Missing 

Distinct7092
Distinct (%)5.3%
Missing52532
Missing (%)28.2%
Memory size1.4 MiB

Length

Max length25
Median length23
Mean length9.07722561
Min length3

Characters and Unicode

Total characters1216321
Distinct characters29
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2415
Unique (%)1.8%

Sample

1st rowbulbosa
2nd rowclausa
3rd rowmuhlenbergii
4th rowminor
5th rowringens
ValueCountFrequency (%)
canadensis 1478
 
1.1%
virginiana 722
 
0.5%
palustris 710
 
0.5%
canadense 699
 
0.5%
americana 682
 
0.5%
virginica 544
 
0.4%
pubescens 507
 
0.4%
heterophylla 501
 
0.4%
virginianum 500
 
0.4%
nemorosa 495
 
0.4%
Other values (7037) 127605
94.9%

Most occurring characters

ValueCountFrequency (%)
a 165616
13.6%
i 135826
11.2%
s 85814
 
7.1%
l 85712
 
7.0%
e 83593
 
6.9%
r 80561
 
6.6%
u 78876
 
6.5%
n 75852
 
6.2%
t 66207
 
5.4%
o 64052
 
5.3%
Other values (19) 294212
24.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1215173
99.9%
Dash Punctuation 701
 
0.1%
Space Separator 446
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 165616
13.6%
i 135826
11.2%
s 85814
 
7.1%
l 85712
 
7.1%
e 83593
 
6.9%
r 80561
 
6.6%
u 78876
 
6.5%
n 75852
 
6.2%
t 66207
 
5.4%
o 64052
 
5.3%
Other values (16) 293064
24.1%
Dash Punctuation
ValueCountFrequency (%)
- 701
100.0%
Space Separator
ValueCountFrequency (%)
446
100.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1215174
99.9%
Common 1147
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 165616
13.6%
i 135826
11.2%
s 85814
 
7.1%
l 85712
 
7.1%
e 83593
 
6.9%
r 80561
 
6.6%
u 78876
 
6.5%
n 75852
 
6.2%
t 66207
 
5.4%
o 64052
 
5.3%
Other values (17) 293065
24.1%
Common
ValueCountFrequency (%)
- 701
61.1%
446
38.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1216321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 165616
13.6%
i 135826
11.2%
s 85814
 
7.1%
l 85712
 
7.0%
e 83593
 
6.9%
r 80561
 
6.6%
u 78876
 
6.5%
n 75852
 
6.2%
t 66207
 
5.4%
o 64052
 
5.3%
Other values (19) 294212
24.2%

infraspecificEpithet
Text

Missing 

Distinct124
Distinct (%)15.2%
Missing185713
Missing (%)99.6%
Memory size1.4 MiB

Length

Max length16
Median length13
Mean length9.598039216
Min length6

Characters and Unicode

Total characters7832
Distinct characters26
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique56
Unique (%)6.9%

Sample

1st rowgigantospermum
2nd rowdepressa
3rd rowpennsylvanica
4th rowhaenseleri
5th rowlaevigatus
ValueCountFrequency (%)
pennsylvanica 92
 
11.3%
americana 64
 
7.8%
lanceolatum 58
 
7.1%
hastata 36
 
4.4%
strigosus 32
 
3.9%
pulchra 31
 
3.8%
cricumvagum 29
 
3.6%
canadensis 28
 
3.4%
ovatum 25
 
3.1%
tsugetorum 24
 
2.9%
Other values (114) 397
48.7%

Most occurring characters

ValueCountFrequency (%)
a 1259
16.1%
n 687
 
8.8%
i 641
 
8.2%
s 629
 
8.0%
e 542
 
6.9%
c 532
 
6.8%
u 479
 
6.1%
l 464
 
5.9%
t 415
 
5.3%
r 405
 
5.2%
Other values (16) 1779
22.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7832
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1259
16.1%
n 687
 
8.8%
i 641
 
8.2%
s 629
 
8.0%
e 542
 
6.9%
c 532
 
6.8%
u 479
 
6.1%
l 464
 
5.9%
t 415
 
5.3%
r 405
 
5.2%
Other values (16) 1779
22.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 7832
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1259
16.1%
n 687
 
8.8%
i 641
 
8.2%
s 629
 
8.0%
e 542
 
6.9%
c 532
 
6.8%
u 479
 
6.1%
l 464
 
5.9%
t 415
 
5.3%
r 405
 
5.2%
Other values (16) 1779
22.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7832
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1259
16.1%
n 687
 
8.8%
i 641
 
8.2%
s 629
 
8.0%
e 542
 
6.9%
c 532
 
6.8%
u 479
 
6.1%
l 464
 
5.9%
t 415
 
5.3%
r 405
 
5.2%
Other values (16) 1779
22.7%
Distinct12
Distinct (%)< 0.1%
Missing18
Missing (%)< 0.1%
Memory size1.4 MiB

Length

Max length10
Median length7
Mean length6.750609884
Min length5

Characters and Unicode

Total characters1259063
Distinct characters27
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowSpecies
2nd rowSpecies
3rd rowSpecies
4th rowSpecies
5th rowKingdom
ValueCountFrequency (%)
species 129091
69.2%
kingdom 28403
 
15.2%
genus 23966
 
12.8%
variety 3700
 
2.0%
subspecies 814
 
0.4%
forma 391
 
0.2%
order 92
 
< 0.1%
class 22
 
< 0.1%
family 19
 
< 0.1%
division 9
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 287571
22.8%
i 162055
12.9%
s 154738
12.3%
S 129906
10.3%
p 129905
10.3%
c 129905
10.3%
n 52378
 
4.2%
m 28814
 
2.3%
o 28803
 
2.3%
d 28498
 
2.3%
Other values (17) 126490
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1072552
85.2%
Uppercase Letter 186511
 
14.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 287571
26.8%
i 162055
15.1%
s 154738
14.4%
p 129905
12.1%
c 129905
12.1%
n 52378
 
4.9%
m 28814
 
2.7%
o 28803
 
2.7%
d 28498
 
2.7%
g 28403
 
2.6%
Other values (9) 41482
 
3.9%
Uppercase Letter
ValueCountFrequency (%)
S 129906
69.7%
K 28403
 
15.2%
G 23966
 
12.8%
V 3700
 
2.0%
F 410
 
0.2%
O 92
 
< 0.1%
C 25
 
< 0.1%
D 9
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1259063
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 287571
22.8%
i 162055
12.9%
s 154738
12.3%
S 129906
10.3%
p 129905
10.3%
c 129905
10.3%
n 52378
 
4.2%
m 28814
 
2.3%
o 28803
 
2.3%
d 28498
 
2.3%
Other values (17) 126490
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1259063
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 287571
22.8%
i 162055
12.9%
s 154738
12.3%
S 129906
10.3%
p 129905
10.3%
c 129905
10.3%
n 52378
 
4.2%
m 28814
 
2.3%
o 28803
 
2.3%
d 28498
 
2.3%
Other values (17) 126490
10.0%
Distinct5421
Distinct (%)3.6%
Missing36820
Missing (%)19.7%
Memory size1.4 MiB

Length

Max length58
Median length48
Mean length10.06824573
Min length2

Characters and Unicode

Total characters1507307
Distinct characters93
Distinct categories11
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1789
Unique (%)1.2%

Sample

1st row(Alph. Wood) Smyth & Smyth
2nd rowRaf.
3rd rowSchkuhr.
4th rowNees
5th rowL.
ValueCountFrequency (%)
l 41922
 
15.6%
ex 9869
 
3.7%
linnaeus 8731
 
3.2%
7867
 
2.9%
a 5064
 
1.9%
hedw 4935
 
1.8%
michx 4843
 
1.8%
willd 4414
 
1.6%
dumort 3945
 
1.5%
gray 3826
 
1.4%
Other values (2448) 173558
64.5%

Most occurring characters

ValueCountFrequency (%)
. 170008
 
11.3%
119262
 
7.9%
e 93740
 
6.2%
n 72976
 
4.8%
r 68079
 
4.5%
L 64183
 
4.3%
i 60756
 
4.0%
a 58874
 
3.9%
( 54629
 
3.6%
) 54628
 
3.6%
Other values (83) 690172
45.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 824847
54.7%
Uppercase Letter 260511
 
17.3%
Other Punctuation 181769
 
12.1%
Space Separator 119271
 
7.9%
Open Punctuation 55083
 
3.7%
Close Punctuation 54661
 
3.6%
Decimal Number 9476
 
0.6%
Control 858
 
0.1%
Final Punctuation 456
 
< 0.1%
Dash Punctuation 374
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 93740
 
11.4%
n 72976
 
8.8%
r 68079
 
8.3%
i 60756
 
7.4%
a 58874
 
7.1%
l 49370
 
6.0%
o 49280
 
6.0%
h 48951
 
5.9%
t 44981
 
5.5%
s 42212
 
5.1%
Other values (27) 235628
28.6%
Uppercase Letter
ValueCountFrequency (%)
L 64183
24.6%
S 24732
 
9.5%
M 18669
 
7.2%
A 17663
 
6.8%
H 15973
 
6.1%
B 15740
 
6.0%
W 14011
 
5.4%
G 12213
 
4.7%
D 11513
 
4.4%
R 8920
 
3.4%
Other values (16) 56894
21.8%
Decimal Number
ValueCountFrequency (%)
1 2393
25.3%
7 2022
21.3%
3 1774
18.7%
6 1430
15.1%
9 661
 
7.0%
8 451
 
4.8%
0 328
 
3.5%
5 277
 
2.9%
4 75
 
0.8%
2 65
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 170008
93.5%
& 7863
 
4.3%
, 3814
 
2.1%
' 59
 
< 0.1%
? 24
 
< 0.1%
\ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 54629
99.2%
406
 
0.7%
[ 37
 
0.1%
11
 
< 0.1%
Control
ValueCountFrequency (%)
 705
82.2%
 147
 
17.1%
 6
 
0.7%
Space Separator
ValueCountFrequency (%)
119262
> 99.9%
  9
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 54628
99.9%
] 33
 
0.1%
Final Punctuation
ValueCountFrequency (%)
456
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 374
100.0%
Math Symbol
ValueCountFrequency (%)
¬ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1085358
72.0%
Common 421949
 
28.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 93740
 
8.6%
n 72976
 
6.7%
r 68079
 
6.3%
L 64183
 
5.9%
i 60756
 
5.6%
a 58874
 
5.4%
l 49370
 
4.5%
o 49280
 
4.5%
h 48951
 
4.5%
t 44981
 
4.1%
Other values (53) 474168
43.7%
Common
ValueCountFrequency (%)
. 170008
40.3%
119262
28.3%
( 54629
 
12.9%
) 54628
 
12.9%
& 7863
 
1.9%
, 3814
 
0.9%
1 2393
 
0.6%
7 2022
 
0.5%
3 1774
 
0.4%
6 1430
 
0.3%
Other values (20) 4126
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1505465
99.9%
None 969
 
0.1%
Punctuation 873
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 170008
 
11.3%
119262
 
7.9%
e 93740
 
6.2%
n 72976
 
4.8%
r 68079
 
4.5%
L 64183
 
4.3%
i 60756
 
4.0%
a 58874
 
3.9%
( 54629
 
3.6%
) 54628
 
3.6%
Other values (64) 688330
45.7%
None
ValueCountFrequency (%)
 705
72.8%
 147
 
15.2%
é 42
 
4.3%
ä 21
 
2.2%
ü 16
 
1.7%
  9
 
0.9%
è 6
 
0.6%
 6
 
0.6%
ó 5
 
0.5%
ÿ 4
 
0.4%
Other values (6) 8
 
0.8%
Punctuation
ValueCountFrequency (%)
456
52.2%
406
46.5%
11
 
1.3%
Distinct179
Distinct (%)0.1%
Missing18
Missing (%)< 0.1%
Memory size1.4 MiB

Length

Max length98
Median length78
Mean length29.30921501
Min length5

Characters and Unicode

Total characters5466491
Distinct characters38
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique17
Unique (%)< 0.1%

Sample

1st rowrushes; angiosperms; tracheophytes; plants
2nd rowgentians; angiosperms; tracheophytes; plants
3rd rowsedges; angiosperms; tracheophytes; plants
4th rowliverworts; mosses; plants
5th rowplants; plants
ValueCountFrequency (%)
plants 205898
36.1%
tracheophytes 104057
18.2%
angiosperms 81757
 
14.3%
mosses 38430
 
6.7%
liverworts 21780
 
3.8%
sedges 13776
 
2.4%
algae 7363
 
1.3%
sunflowers 7281
 
1.3%
grasses 6277
 
1.1%
ferns 5822
 
1.0%
Other values (217) 78501
 
13.7%

Most occurring characters

ValueCountFrequency (%)
s 752120
13.8%
e 464919
 
8.5%
t 458192
 
8.4%
a 433633
 
7.9%
p 404044
 
7.4%
384431
 
7.0%
; 364831
 
6.7%
n 324111
 
5.9%
o 294499
 
5.4%
r 290716
 
5.3%
Other values (28) 1294995
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4711435
86.2%
Space Separator 384431
 
7.0%
Other Punctuation 367273
 
6.7%
Dash Punctuation 2905
 
0.1%
Uppercase Letter 269
 
< 0.1%
Open Punctuation 89
 
< 0.1%
Close Punctuation 89
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 752120
16.0%
e 464919
9.9%
t 458192
9.7%
a 433633
9.2%
p 404044
8.6%
n 324111
6.9%
o 294499
 
6.3%
r 290716
 
6.2%
l 264824
 
5.6%
h 223345
 
4.7%
Other values (15) 801032
17.0%
Uppercase Letter
ValueCountFrequency (%)
A 78
29.0%
G 62
23.0%
J 54
20.1%
B 47
17.5%
P 27
 
10.0%
H 1
 
0.4%
Other Punctuation
ValueCountFrequency (%)
; 364831
99.3%
, 1369
 
0.4%
' 1073
 
0.3%
Space Separator
ValueCountFrequency (%)
384431
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2905
100.0%
Open Punctuation
ValueCountFrequency (%)
( 89
100.0%
Close Punctuation
ValueCountFrequency (%)
) 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4711704
86.2%
Common 754787
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 752120
16.0%
e 464919
9.9%
t 458192
9.7%
a 433633
9.2%
p 404044
8.6%
n 324111
6.9%
o 294499
 
6.3%
r 290716
 
6.2%
l 264824
 
5.6%
h 223345
 
4.7%
Other values (21) 801301
17.0%
Common
ValueCountFrequency (%)
384431
50.9%
; 364831
48.3%
- 2905
 
0.4%
, 1369
 
0.2%
' 1073
 
0.1%
( 89
 
< 0.1%
) 89
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5466491
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 752120
13.8%
e 464919
 
8.5%
t 458192
 
8.4%
a 433633
 
7.9%
p 404044
 
7.4%
384431
 
7.0%
; 364831
 
6.7%
n 324111
 
5.9%
o 294499
 
5.4%
r 290716
 
5.3%
Other values (28) 1294995
23.7%

nomenclaturalCode
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters746116
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowICBN
2nd rowICBN
3rd rowICBN
4th rowICBN
5th rowICBN
ValueCountFrequency (%)
icbn 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 746116
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 746116
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 746116
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%

taxonRemarks
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB

Length

Max length26
Median length26
Mean length26
Min length26

Characters and Unicode

Total characters4849754
Distinct characters12
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAnimals and Plants: Plants
2nd rowAnimals and Plants: Plants
3rd rowAnimals and Plants: Plants
4th rowAnimals and Plants: Plants
5th rowAnimals and Plants: Plants
ValueCountFrequency (%)
plants 373058
50.0%
animals 186529
25.0%
and 186529
25.0%

Most occurring characters

ValueCountFrequency (%)
n 746116
15.4%
a 746116
15.4%
l 559587
11.5%
s 559587
11.5%
559587
11.5%
P 373058
7.7%
t 373058
7.7%
A 186529
 
3.8%
i 186529
 
3.8%
m 186529
 
3.8%
Other values (2) 373058
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3544051
73.1%
Space Separator 559587
 
11.5%
Uppercase Letter 559587
 
11.5%
Other Punctuation 186529
 
3.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 746116
21.1%
a 746116
21.1%
l 559587
15.8%
s 559587
15.8%
t 373058
10.5%
i 186529
 
5.3%
m 186529
 
5.3%
d 186529
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
P 373058
66.7%
A 186529
33.3%
Space Separator
ValueCountFrequency (%)
559587
100.0%
Other Punctuation
ValueCountFrequency (%)
: 186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4103638
84.6%
Common 746116
 
15.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 746116
18.2%
a 746116
18.2%
l 559587
13.6%
s 559587
13.6%
P 373058
9.1%
t 373058
9.1%
A 186529
 
4.5%
i 186529
 
4.5%
m 186529
 
4.5%
d 186529
 
4.5%
Common
ValueCountFrequency (%)
559587
75.0%
: 186529
 
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4849754
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 746116
15.4%
a 746116
15.4%
l 559587
11.5%
s 559587
11.5%
559587
11.5%
P 373058
7.7%
t 373058
7.7%
A 186529
 
3.8%
i 186529
 
3.8%
m 186529
 
3.8%
Other values (2) 373058
7.7%